The number of small blocks in exchangeable random partitions 



by Jason Schweinsberg* 
University of California, San Diego 

July 14, 2010 



Abstract 

Suppose $\Pi$ is an exchangeable random partition of the positive integers and $\Pi_n$ is its
restriction to $\{1, \dots, n\}$. Let $K_n$ denote the number of blocks of $\Pi_n$, and let $K_{n,r}$ denote the
number of blocks of $\Pi_n$ containing $r$ integers. We show that if $0 < \alpha < 1$ and $K_n/(n^\alpha \ell(n))$
converges in probability to $\Gamma(1-\alpha)$, where $\ell$ is a slowly varying function, then $K_{n,r}/(n^\alpha \ell(n))$
converges in probability to $\alpha\Gamma(r-\alpha)/r!$. This result was previously known when the
convergence of $K_n/(n^\alpha \ell(n))$ holds almost surely, but the result under the hypothesis of convergence
in probability has significant implications for coalescent theory. We also show that a related
conjecture for the case when $K_n$ grows only slightly slower than $n$ fails to be true.

1 Introduction 

We begin by recalling some basic facts about exchangeable random partitions. Suppose $\pi$ is a
partition of the set $\mathbb{N}$ of positive integers. If $\sigma$ is a permutation of $\mathbb{N}$, then we can define a
partition $\sigma\pi$ such that the integers $\sigma(i)$ and $\sigma(j)$ are in the same block of $\sigma\pi$ if and only if $i$ and
$j$ are in the same block of $\pi$. A random partition $\Pi$ of $\mathbb{N}$ is said to be exchangeable if $\sigma\Pi$ and $\Pi$
have the same distribution for all permutations $\sigma$ of $\mathbb{N}$ having the property that $\sigma(j) = j$ for all
but finitely many $j$.

In 1978, Kingman [14] proved an analog of de Finetti's Theorem that characterizes all possible
exchangeable random partitions. He showed that there is a one-to-one correspondence between
distributions of exchangeable random partitions and probability measures on the infinite simplex
$\Delta = \{(x_i)_{i=1}^\infty : x_1 \ge x_2 \ge \cdots \ge 0, \sum_{i=1}^\infty x_i \le 1\}$. Given a probability distribution $\mu$ on $\Delta$,
the associated exchangeable random partition is constructed as follows. First, choose a random
sequence $(P_j)_{j=1}^\infty$ with distribution $\mu$. Then define random variables $(\xi_k)_{k=1}^\infty$ that are conditionally
independent given $(P_j)_{j=1}^\infty$ and satisfy $P(\xi_k = i \mid (P_j)_{j=1}^\infty) = P_i$ and
$P(\xi_k = -k \mid (P_j)_{j=1}^\infty) = 1 - \sum_{j=1}^\infty P_j$. Finally, define $\Pi$ to be the random partition of $\mathbb{N}$ such that two integers $i$ and $j$ are in
the same block of $\Pi$ if and only if $\xi_i = \xi_j$.
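
The construction just described can be simulated directly. The following is a minimal sketch, assuming a finite list of frequencies is supplied in place of the infinite sequence $(P_j)$; the function names `paintbox_partition` and `block_counts` are illustrative and do not come from the paper.

```python
import random
from collections import defaultdict


def paintbox_partition(freqs, n, rng):
    """Sample the restriction to {1, ..., n} of the partition built from freqs.

    freqs: block frequencies P_1 >= P_2 >= ... >= 0 with sum at most 1.
    Integer k receives label i with probability P_i, and its own private
    label -k (which forces a singleton block) with probability 1 - sum(freqs).
    """
    blocks = defaultdict(list)
    for k in range(1, n + 1):
        u = rng.random()
        label = -k  # default label, unique to k
        cumulative = 0.0
        for i, p in enumerate(freqs, start=1):
            cumulative += p
            if u < cumulative:
                label = i
                break
        blocks[label].append(k)
    return sorted(blocks.values())


def block_counts(partition):
    """Return the number of blocks and a dict mapping size r to the number of blocks of size r."""
    sizes = defaultdict(int)
    for block in partition:
        sizes[len(block)] += 1
    return len(partition), dict(sizes)
```

Two integers share a block exactly when they drew the same label, so a frequency list summing to less than one produces singleton blocks, as in the construction.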

It follows from this construction and the Law of Large Numbers that if $B$ is a block of an
exchangeable random partition $\Pi$, then the asymptotic frequency of the block, defined by

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n 1_{\{i \in B\}},$$

exists almost surely. The nonzero asymptotic frequencies of the blocks of $\Pi$ are the nonzero terms
of the sequence $(P_j)_{j=1}^\infty$. Each integer is in a block having positive asymptotic frequency with
probability $\sum_{j=1}^\infty P_j$ and is in a singleton block with probability $1 - \sum_{j=1}^\infty P_j$.

Given an exchangeable random partition $\Pi$ of $\mathbb{N}$, let $\Pi_n$ denote its restriction to $\{1, \dots, n\}$.
That is, $\Pi_n$ is the partition of $\{1, \dots, n\}$ such that two integers $i$ and $j$ in $\{1, \dots, n\}$ are in
the same block of $\Pi_n$ if and only if they are in the same block of $\Pi$. Let $K_n$ be the number
of blocks of $\Pi_n$, and let $K_{n,r}$ be the number of blocks of $\Pi_n$ having size $r$. In this paper, we
show how asymptotic results for the random variables $K_{n,r}$ as $n \to \infty$ can be deduced from the
asymptotic behavior of $K_n$. Such results have already been proved, and are summarized in [11],
for the case in which the asymptotic frequencies $P_j$ are deterministic and sum to one. This is
the setting of the classical infinite occupancy problem, in which infinitely many balls are placed
independently into infinitely many boxes, with each ball going into the $j$th box with probability
$P_j$. Here we extend these results to the general case of random $P_j$ and explore the applications
of this extension to coalescent theory and population genetics.

We note that in addition to the results below concerning the asymptotic behavior of $K_{n,r}$,
Central Limit Theorems have been established for the number of small blocks in exchangeable
random partitions under certain conditions. See [13] for some early work in this direction and [2]
for some recent extensions.



1.1 The power law case 

We first consider the case in which the number of blocks $K_n$ grows like $n^\alpha$, where $0 < \alpha < 1$.
The proposition below is essentially due to Karlin [13]. More precisely, it follows from combining
Theorem 1 of [13] with a Tauberian theorem. The result also appears as Corollary 21 in the
recent survey [11]. Recall that a measurable function $\ell : (0, \infty) \to (0, \infty)$ is said to be slowly
varying if for all $c > 0$, we have $\lim_{y \to \infty} \ell(cy)/\ell(y) = 1$.

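Proposition 1. Let $(p_j)_{j=1}^\infty$ be a deterministic sequence such that $p_1 \ge p_2 \ge \cdots \ge 0$ and
$\sum_{j=1}^\infty p_j = 1$. For $x > 0$, let $g(x) = \max\{j : p_j \ge x\}$. Let $\Pi$ be an exchangeable random partition
of $\mathbb{N}$ whose asymptotic block frequencies are given by $(p_j)_{j=1}^\infty$ almost surely, and define $K_n$ and
$K_{n,r}$ as above. Suppose $0 < \alpha < 1$. Suppose $\ell : (0, \infty) \to (0, \infty)$ is a slowly varying function. We
have

$$\lim_{x \to 0} \frac{x^\alpha g(x)}{\ell(1/x)} = 1 \tag{1}$$

if and only if

$$\lim_{n \to \infty} \frac{K_n}{n^\alpha \ell(n)} = \Gamma(1 - \alpha) \quad \text{a.s.} \tag{2}$$

These two statements imply that for all $r \in \mathbb{N}$, we have

$$\lim_{n \to \infty} \frac{K_{n,r}}{n^\alpha \ell(n)} = \frac{\alpha\Gamma(r - \alpha)}{r!} \quad \text{a.s.} \tag{3}$$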

Our main theorem is an extension of Proposition 1 to general exchangeable random partitions.
It is an immediate consequence of Proposition 1 that even when the $P_j$ may be random, the
condition (2) implies (3). The result below says that this implication remains valid even when
the convergence in (2) holds only in probability. As we will see shortly, this result has applications
in coalescent theory, where it can be much easier to establish convergence in probability for $K_n$
than almost sure convergence.

Theorem 2. Suppose $\Pi$ is an exchangeable random partition of $\mathbb{N}$, and define $K_n$ and $K_{n,r}$ as
above. Suppose $0 < \alpha < 1$, and suppose $\ell : (0, \infty) \to (0, \infty)$ is a slowly varying function. If

$$\lim_{n \to \infty} \frac{K_n}{n^\alpha \ell(n)} = \Gamma(1 - \alpha) \quad \text{in probability,} \tag{4}$$

then for all $r \in \mathbb{N}$, we have

$$\lim_{n \to \infty} \frac{K_{n,r}}{n^\alpha \ell(n)} = \frac{\alpha\Gamma(r - \alpha)}{r!} \quad \text{in probability.}$$

We prove Theorem 2 in Section 2. It will follow from this proof (see Lemma 12 below) that
(4) implies that the limit (1) holds in probability. However, the converse implication is false.
Of course, it is clear that the converse cannot hold for general exchangeable random partitions
because (1) can hold even when $\sum_{j=1}^\infty P_j < 1$, in which case $K_n$ will be of order $n$ rather than
of order $n^\alpha$. However, as the next example shows, even under the additional condition that
$\sum_{j=1}^\infty P_j = 1$, it is possible for the limit (1) to hold in probability but for (4) to fail.



Example 3. There exists an exchangeable random partition $\Pi$ of $\mathbb{N}$ whose asymptotic frequencies
satisfy $\sum_{j=1}^\infty P_j = 1$ a.s. such that if $G(x) = \max\{j : P_j \ge x\}$, then

$$\lim_{x \to 0} x^\alpha G(x) = 1 \quad \text{in probability}$$

but $n^{-\alpha} K_n$ does not converge in probability to $\Gamma(1 - \alpha)$ as $n \to \infty$.

We describe the example in detail, and prove that it has the stated properties, in Section 3.

1.2 The case in which $K_n$ is only slightly smaller than $n$

Proposition 1 and Theorem 2 give asymptotic results for $K_{n,r}$ when $K_n$ grows like $n^\alpha$ for
$0 < \alpha < 1$. The result below concerns the case when $K_n$ grows just slightly slower than $n$. This
result can be obtained from results in [11] by combining Propositions 14 and 18 with Lemma 1,
Proposition 2, and the remarks before and after Proposition 2.

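Proposition 4. Let $(p_j)_{j=1}^\infty$ be a deterministic sequence such that $p_1 \ge p_2 \ge \cdots \ge 0$ and
$\sum_{j=1}^\infty p_j = 1$. For $x > 0$, let $g(x) = \max\{j : p_j \ge x\}$. Let $\Pi$ be an exchangeable random partition
of $\mathbb{N}$ whose asymptotic block frequencies are given by $(p_j)_{j=1}^\infty$ almost surely, and define $K_n$ and
$K_{n,r}$ as above. Suppose $\ell : (0, \infty) \to (0, \infty)$ is a slowly varying function, and for $t > 0$, let
$\ell_1(t) = \int_t^\infty \ell(s)/s \, ds$. Suppose that

$$\lim_{x \to 0} \frac{x g(x)}{\ell(1/x)} = 1. \tag{5}$$

Then

$$\lim_{n \to \infty} \frac{K_n}{n \ell_1(n)} = \lim_{n \to \infty} \frac{K_{n,1}}{n \ell_1(n)} = 1 \quad \text{a.s.} \tag{6}$$

Also, for integers $r \ge 2$,

$$\lim_{n \to \infty} \frac{K_{n,r}}{n \ell(n)} = \frac{1}{r(r - 1)} \quad \text{a.s.} \tag{7}$$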

Our next result addresses a question that is left open by Proposition 4. Although (5) implies
(6) and (7), one can also ask whether there is a result parallel to Theorem 2 in which we obtain
asymptotic results for $K_{n,r}$ just from the asymptotics of $K_n$. However, the example below, which
we describe in detail in Section 4, shows that the condition $K_n/(n\ell_1(n)) \to 1$ a.s. is not sufficient
to imply that the convergence in (7) holds, even in probability. Note that in the notation of
Proposition 4, if $\ell(t) = (\log t)^{-2}$ for all $t \ge T > 1$, then $\ell_1(t) = (\log t)^{-1}$ for all $t \ge T$.

Example 5. There exists an exchangeable random partition $\Pi$ of $\mathbb{N}$ such that if $K_n$ and $K_{n,r}$
are defined as above, then

$$\lim_{n \to \infty} \frac{(\log n) K_n}{n} = 1 \quad \text{a.s.,}$$

but for all integers $r \ge 2$, the quantity $n^{-1} (\log n)^2 K_{n,r}$ does not converge to $1/[r(r - 1)]$ in
probability as $n \to \infty$.

1.3 Applications to coalescent theory and population genetics 

At first glance, Theorem 2 may appear to be only a very minor technical improvement over
Proposition 1. However, Theorem 2 has significant implications for coalescent theory, where it
can be much easier to prove convergence in probability and establish (4) than to prove the almost
sure convergence needed to obtain (2).

Suppose we take a sample of size n from a population and follow the ancestral lines of the 
sampled individuals backwards in time. The ancestral lines will coalesce until all of the sampled 
individuals are traced back to a single common ancestor. This process can be modeled by a 
stochastic process taking its values in the set of partitions of {1, . . . , n}. The standard coalescent 
model is Kingman's coalescent [15], in which it is assumed that only two lineages ever merge 
at a time and each transition that involves the merging of two lineages happens at rate one. 
This means that when there are $b$ lineages, the amount of time before the next merger has an
exponential distribution with rate $\binom{b}{2}$.
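
As a small worked illustration of these rates, the expected time for $n$ lineages to reach a single common ancestor is the sum of the expected holding times $1/\binom{b}{2}$ for $b = n, \dots, 2$, which telescopes to $2(1 - 1/n)$. The sketch below checks this with exact rational arithmetic; the function name `expected_tmrca` is illustrative and not taken from the paper.

```python
from fractions import Fraction
from math import comb


def expected_tmrca(n):
    """Expected time for n lineages to coalesce to one under Kingman's coalescent.

    With b lineages the holding time is exponential with rate C(b, 2),
    so its mean is 1 / C(b, 2); the means are summed over b = 2, ..., n.
    """
    return sum(Fraction(1, comb(b, 2)) for b in range(2, n + 1))
```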

Within the last decade, there has been considerable interest in alternative models of
coalescence, called coalescents with multiple mergers or $\Lambda$-coalescents, that allow many ancestral lines
to merge at once. Such processes were introduced by Pitman [17] and Sagitov [18]. If $\Lambda$ is a
finite measure on $[0, 1]$, then the $\Lambda$-coalescent is the coalescent process having the property that
whenever there are $b$ lineages, each transition that involves $k$ lineages merging into one happens
at rate

$$\lambda_{b,k} = \int_0^1 x^{k-2} (1 - x)^{b-k} \, \Lambda(dx).$$

Multiple mergers of ancestral lines could arise in populations with large family sizes, as many 
ancestral lines could be traced back to the individual that had a large number of offspring. They 
could also arise as a result of natural selection because many ancestral lines could get traced back 
to an individual that had a beneficial mutation which spread rapidly to a large fraction of the 
population. 
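
For a concrete instance of the rate formula, take $\Lambda$ to be a Beta($\alpha$, $2 - \alpha$) distribution, the family considered later in this section. The integral then reduces to a ratio of beta functions, $\lambda_{b,k} = B(k - 2 + \alpha, b - k + 2 - \alpha)/B(\alpha, 2 - \alpha)$. The sketch below evaluates this ratio; the function names are illustrative, and the identity $\lambda_{b,k} = \lambda_{b+1,k} + \lambda_{b+1,k+1}$ used as a sanity check follows from writing $1 = (1 - x) + x$ inside the integral.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_coalescent_rate(b, k, alpha):
    """Rate lambda_{b,k} when Lambda is the Beta(alpha, 2 - alpha) distribution.

    Requires 2 <= k <= b and 0 < alpha < 2; alpha = 1 gives the uniform
    distribution on [0, 1].
    """
    return exp(log_beta(k - 2 + alpha, b - k + 2 - alpha) - log_beta(alpha, 2 - alpha))
```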

To model mutations, we put marks representing mutations at points of a rate $\theta$ Poisson process
along each branch of the coalescent tree. One can then define a random partition $\Pi_n$ of $\{1, \dots, n\}$,
often called the allelic partition, by declaring $i$ and $j$ to be in the same block of $\Pi_n$ if and only
if the $i$th and $j$th sampled individuals inherit the same mutations. These partitions $\Pi_n$ can
be defined consistently as $n$ increases simply by sampling more individuals, so by Kolmogorov's
Extension Theorem, on some probability space there is an exchangeable random partition $\Pi$ of
$\mathbb{N}$ such that $\Pi_n$ is the restriction of $\Pi$ to $\{1, \dots, n\}$.

Figure 1: This figure shows the genealogy of five sampled individuals. The boxes represent
mutations. Individual 1 inherited no mutations, individual 2 inherited mutation C, individual 3 inherited
mutation A, and individuals 4 and 5 inherited mutations A and B. Therefore, the allelic partition is
$\Pi_5 = \{\{1\}, \{2\}, \{3\}, \{4, 5\}\}$. We have $K_5 = 4$. Also, $K_{5,1} = 3$, $K_{5,2} = 1$, and $K_{5,3} = K_{5,4} = K_{5,5} = 0$.

When the underlying coalescent process is
Kingman's coalescent, the distribution of $\Pi_n$ is given by the Ewens Sampling Formula [10]. The
probability that $\Pi_n$ has $a_j$ blocks of size $j$ for $j = 1, \dots, n$ is given by



$$\frac{n!}{2\theta(2\theta + 1) \cdots (2\theta + n - 1)} \prod_{j=1}^n \left(\frac{2\theta}{j}\right)^{a_j} \frac{1}{a_j!}.$$
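
One way to sanity-check the Ewens Sampling Formula is to confirm that its values sum to one over all block-count vectors $(a_1, \dots, a_n)$ with $\sum_j j a_j = n$. The sketch below does this with exact rational arithmetic; the argument `theta2` stands for the quantity $2\theta$ in the text, and both function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def ewens_probability(counts, theta2):
    """Ewens Sampling Formula for counts = [a_1, ..., a_n], with parameter theta2 = 2*theta.

    theta2 should be an int or a Fraction so that the arithmetic stays exact.
    """
    n = sum(j * a for j, a in enumerate(counts, start=1))
    rising = Fraction(1)
    for i in range(n):
        rising *= theta2 + i  # theta2 (theta2 + 1) ... (theta2 + n - 1)
    prob = Fraction(factorial(n)) / rising
    for j, a in enumerate(counts, start=1):
        prob *= Fraction(theta2, j) ** a / factorial(a)
    return prob


def block_count_vectors(n):
    """Yield every list [a_1, ..., a_n] of nonnegative integers with sum_j j * a_j = n."""
    def rec(j, remaining):
        if j == 0:
            if remaining == 0:
                yield []
            return
        for a in range(remaining // j + 1):
            for rest in rec(j - 1, remaining - a * j):
                yield rest + [a]
    yield from rec(n, n)
```

For example, with $n = 2$ the two possible allelic partitions have probabilities $2\theta/(2\theta + 1)$ (two singletons) and $1/(2\theta + 1)$ (one block of size two).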



When the underlying coalescent process is some other $\Lambda$-coalescent, there is no simple expression
for the distribution of $\Pi_n$. However, defining $K_n$ and $K_{n,r}$ from $\Pi_n$ as above, it was shown in [5], [6]
that if $\Lambda$ is the Beta($\alpha$, $2 - \alpha$) distribution with $0 < \alpha < 1$, then

$$\lim_{n \to \infty} \frac{K_n}{n^\alpha} = \frac{\theta(2 - \alpha)(1 - \alpha)\Gamma(2 - \alpha)}{\alpha} \quad \text{in probability.} \tag{8}$$

It was then shown in [6] that

$$\lim_{n \to \infty} \frac{K_{n,r}}{n^\alpha} = \frac{\theta(2 - \alpha)(1 - \alpha)^2 \Gamma(r - \alpha)}{r!} \quad \text{in probability.} \tag{9}$$

Note that $\alpha$ here corresponds to $2 - \alpha$ in [5] and [6]. The proof of (9) in [6] is rather technical,
exploiting a connection between beta coalescents and the genealogy of continuous-state branching
processes. However, Theorem 2 makes it possible to deduce (9) immediately from (8), because
$\Gamma(2 - \alpha) = (1 - \alpha)\Gamma(1 - \alpha)$. We also
note that the convergence in (8) was later shown in [4] to hold almost surely, allowing (9) to
be established via Proposition 1. On the other hand, if $\Lambda$ is the uniform distribution on $[0, 1]$,
corresponding to $\alpha = 1$ above, it was shown in [3], building on work of [9], that

$$\lim_{n \to \infty} \frac{(\log n) K_n}{n} = \theta \quad \text{in probability.} \tag{10}$$

It was also shown in [3] that for integers $k \ge 2$,

$$\lim_{n \to \infty} \frac{(\log n)^2 K_{n,k}}{n} = \frac{\theta}{k(k - 1)} \quad \text{in probability.} \tag{11}$$

However, Example 5 establishes that (10) does not imply (11). Indeed, the proof of (11) in [3]
involves a detailed analysis of a Markov chain on different time scales.

1.4 A model of a growing population 

To illustrate another application of Theorem 2, we consider the following model of a population
that grows in size over time. Fix $\gamma > 0$ and a positive integer $N$. Assume that for each positive
integer $k$, there are $\lceil N k^{-\gamma} \rceil$ individuals in generation $-k$. For simplicity, assume that the number
of individuals in generation zero is the same as the number of individuals in generation $-1$, so
there are $N$ individuals in generations $0$ and $-1$ but fewer in earlier generations. To give the model
a genealogical structure, we assume, as in the standard Wright-Fisher model, that each individual
chooses its parent uniformly at random from the individuals in the previous generation.

Now sample $n$ individuals from the population at time zero, and follow their ancestral lines
backwards in time. We can represent the genealogy of these sampled individuals by a coalescent
process $(\Psi_{N,n}(t), t \ge 0)$ taking its values in the set of partitions of $\{1, \dots, n\}$, where two integers
$i$ and $j$ are in the same block of the partition $\Psi_{N,n}(t)$ if and only if the $i$th and $j$th individuals
in the sample have the same ancestor at time $-\lfloor N^{1/(1+\gamma)} t \rfloor$. It is easy to check that as $N \to \infty$,
these processes converge to a coalescent process $(\Psi_n(t), t \ge 0)$ having the property that at time
$t$, two lineages (that is, two blocks of the partition) are merging at rate $t^\gamma$. To see this, note that
in generation $N^{1/(1+\gamma)} t$, two individuals have the same ancestor with probability approximately
$N^{-1} (N^{1/(1+\gamma)} t)^\gamma$, and multiplying this expression by the time-scaling factor $N^{1/(1+\gamma)}$ gives the
coalescence rate of $t^\gamma$. Note that $(\Psi_n(t), t \ge 0)$ is a time-inhomogeneous Markov chain.

The process $(\Psi_n(t), t \ge 0)$ can be obtained as a time-change of Kingman's coalescent. Indeed,
let $(\Theta_n(t), t \ge 0)$ be Kingman's coalescent started with $n$ lineages. That is, $(\Theta_n(t), t \ge 0)$ is
a continuous-time, time-homogeneous Markov chain taking values in the set of partitions of
$\{1, \dots, n\}$ such that $\Theta_n(0) = \{\{1\}, \{2\}, \dots, \{n\}\}$, each transition that involves merging two
blocks of the partition happens at rate one, and no other transitions are possible. Then we can
define

$$\Psi_n(t) = \Theta_n\!\left(\frac{t^{1+\gamma}}{1 + \gamma}\right). \tag{12}$$

The time change makes $\Psi_n$ a time-inhomogeneous Markov chain in which at time $t$, each pair of
blocks is merging at rate $t^\gamma$.
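
The time change suggests a simple way to simulate merger times: generate the merger times of Kingman's coalescent and map each one through the inverse of $t \mapsto t^{1+\gamma}/(1+\gamma)$, which is the cumulative pair-merger rate $\int_0^t s^\gamma \, ds$. The following is a minimal sketch under that assumption; it tracks only the times at which the number of lineages drops, not the partitions themselves, and the function names are illustrative.

```python
import random


def kingman_merger_times(n, rng):
    """Times at which Kingman's coalescent, started from n lineages, loses a lineage."""
    times, s = [], 0.0
    for b in range(n, 1, -1):
        s += rng.expovariate(b * (b - 1) / 2)  # holding time with rate C(b, 2)
        times.append(s)
    return times


def time_changed_merger_times(n, gamma, rng):
    """Merger times when each pair of lineages merges at rate t**gamma at time t.

    A Kingman merger time s is mapped to the t solving t**(1 + gamma) / (1 + gamma) = s.
    """
    return [((1 + gamma) * s) ** (1 / (1 + gamma)) for s in kingman_merger_times(n, rng)]
```

Setting `gamma = 0` recovers Kingman's coalescent itself, which gives a quick consistency check.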

We will now work with the coalescent process $(\Psi_n(t), t \ge 0)$ and, as before, put mutations
along each lineage at times of a rate $\theta$ Poisson process. Then define the partition $\Pi_n$ such that
$i$ and $j$ are in the same block of $\Pi_n$ if and only if the $i$th and $j$th sampled individuals inherit
the same mutations. The partitions $\Pi_n$ can be defined consistently as $n$ varies, so there is an
exchangeable random partition $\Pi$ of $\mathbb{N}$ such that $\Pi_n$ is the restriction of $\Pi$ to $\{1, \dots, n\}$. Define
$K_n$ and $K_{n,r}$ as before. We obtain the following result.

Theorem 6. Consider the time-inhomogeneous coalescent process with mutations described above.
Let $\alpha = \gamma/(1 + \gamma) \in (0, 1)$. We have

$$\lim_{n \to \infty} \frac{K_n}{n^\alpha} = \frac{\theta 2^{1-\alpha} (1 - \alpha)^\alpha \pi}{\sin(\pi\alpha)} \quad \text{in probability} \tag{13}$$

and for all $r \in \mathbb{N}$,

$$\lim_{n \to \infty} \frac{K_{n,r}}{n^\alpha} = \frac{\theta 2^{1-\alpha} (1 - \alpha)^\alpha \pi}{\sin(\pi\alpha)} \cdot \frac{\alpha\Gamma(r - \alpha)}{r! \, \Gamma(1 - \alpha)} \quad \text{in probability.} \tag{14}$$

Of course, in view of Theorem 2, equation (14) follows immediately from (13), so we need
only prove (13), which we do in Section 5.

Note that for both the beta coalescent and for the time-inhomogeneous coalescent described
above, we have

$$\lim_{n \to \infty} \frac{K_{n,r}}{K_n} = \frac{\alpha\Gamma(r - \alpha)}{r! \, \Gamma(1 - \alpha)} \quad \text{in probability.} \tag{15}$$

The left-hand side of (15) is the fraction of blocks of the allelic partition having size $r$, and the
sequence of numbers $K_{n,r}$ for $1 \le r \le n$ is often called the allele frequency spectrum. Thus, (15)
says that we get the same allele frequency spectrum for these two models, as we would with any
coalescent model having the property that $K_n$ grows like $n^\alpha$.
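
The limiting fractions in (15) are easy to tabulate. The sketch below evaluates $\alpha\Gamma(r - \alpha)/(r!\,\Gamma(1 - \alpha))$ with log-gamma functions to avoid overflow; the function names are illustrative. The partial sums satisfy the telescoping identity $\sum_{r=1}^{R} \alpha\Gamma(r - \alpha)/r! = \Gamma(1 - \alpha) - \Gamma(R + 1 - \alpha)/R!$, which can be checked by induction on $R$, so the fractions sum to one over all $r$.

```python
from math import exp, lgamma


def limiting_fraction(r, alpha):
    """Limiting fraction of blocks of size r: alpha * Gamma(r - alpha) / (r! * Gamma(1 - alpha))."""
    return alpha * exp(lgamma(r - alpha) - lgamma(r + 1) - lgamma(1 - alpha))


def partial_sum(R, alpha):
    """Sum of the limiting fractions over block sizes r = 1, ..., R."""
    return sum(limiting_fraction(r, alpha) for r in range(1, R + 1))
```

In particular the limiting fraction of singleton blocks is $\alpha$ itself, and the fraction of blocks of size two is $\alpha(1 - \alpha)/2$.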

One of the central goals of population genetics is to use information about a sample from 
a current population to obtain information about the history of the population. Distinguishing 
among various factors that could cause the genealogy of the population to differ from Kingman's 
coalescent can be challenging. See, for example, [12] and [16] for a discussion of the issue of
distinguishing the effects of natural selection from demographic factors such as changing population
size. Therefore, from the perspective of population genetics, Theorem 6 is perhaps disappointing.
Theorem 6 shows that the allele frequency spectrum that arises when the genealogy is given by
a beta coalescent, as could be the case for populations with large family sizes, could also arise
in a population whose size is increasing over time. Thus, one cannot necessarily use the allele
frequency spectrum to distinguish populations with large family sizes from populations that are
increasing in size. In general, Proposition 1 and Theorem 2 suggest that the same allele frequency
spectrum may arise in a wide variety of models, and thus may explain part of the difficulty in 
distinguishing among various factors that could cause the genealogy of a population to differ from 
Kingman's coalescent. 



2 Proof of Theorem 2

Throughout this section, we assume that $0 < \alpha < 1$ and that $\Pi$ is an exchangeable random
partition. We define $K_n$ and $K_{n,r}$ as in Theorem 2. We assume that $\ell : (0, \infty) \to (0, \infty)$ is a slowly
varying function and that (4) holds. We denote by $P_1 \ge P_2 \ge \cdots$ the asymptotic frequencies
of the blocks of $\Pi$. Note that (4) implies that $\sum_{j=1}^\infty P_j = 1$ a.s. because $\liminf_{n \to \infty} n^{-1} K_n > 0$
almost surely on the event that $\sum_{j=1}^\infty P_j < 1$. For $x > 0$, define $G(x) = \max\{j : P_j \ge x\}$, which
is a random variable because the $P_j$ are random.

At times in the proof of Theorem 2, it will be useful to use a technique called Poissonization.
Let $(N(t), t \ge 0)$ be a rate one Poisson process, so that $N(t)$ has the Poisson distribution with
mean $t$ for all $t$. Define the random variable

$$\Phi(t) = E[K_{N(t)} \mid (P_j)_{j=1}^\infty].$$

Likewise, for positive integers $r$, define

$$\Phi_r(t) = E[K_{N(t),r} \mid (P_j)_{j=1}^\infty].$$

We have (see the proof of Proposition 17 in [11])

$$\Phi(t) = t \int_0^\infty e^{-tx} G(x) \, dx \quad \text{a.s.} \tag{16}$$

Also,

$$\Phi_r(t) = \frac{t^r}{r!} \sum_{j=1}^\infty P_j^r e^{-tP_j} \quad \text{a.s.} \tag{17}$$
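
The Poissonized quantities are easy to compute for a finite list of frequencies summing to one, which gives a concrete check of the integral representation of $\Phi$. In the sketch below, which is illustrative rather than taken from the paper, `phi_sum` uses the fact that box $j$ is occupied at time $t$ with probability $1 - e^{-tp_j}$, `phi_r` uses the Poisson probability of exactly $r$ balls in box $j$, and `phi_integral` evaluates $t \int_0^\infty e^{-tx} G(x) \, dx$ exactly by treating $G$ as a step function.

```python
from math import exp, factorial


def phi_sum(t, freqs):
    """Phi(t) as a sum: box j is nonempty at time t with probability 1 - exp(-t * p_j)."""
    return sum(1.0 - exp(-t * p) for p in freqs)


def phi_r(t, r, freqs):
    """Phi_r(t) = (t^r / r!) * sum_j p_j^r * exp(-t * p_j)."""
    return sum((t * p) ** r * exp(-t * p) for p in freqs) / factorial(r)


def phi_integral(t, freqs):
    """t * integral of exp(-t x) G(x) over x > 0, where G(x) = #{j : p_j >= x}.

    For frequencies sorted in decreasing order, G equals j on the interval
    (p_{j+1}, p_j], with p_{m+1} taken to be 0 for a list of length m.
    """
    ps = sorted(freqs, reverse=True)
    total = 0.0
    for j, p in enumerate(ps, start=1):
        lower = ps[j] if j < len(ps) else 0.0
        # t * integral of exp(-t x) over (lower, p] equals exp(-t lower) - exp(-t p)
        total += j * (exp(-t * lower) - exp(-t * p))
    return total
```

Summing `phi_r(t, r, freqs)` over all $r \ge 1$ recovers `phi_sum(t, freqs)`, reflecting the decomposition of the number of occupied boxes by occupancy count.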

By conditioning on $(P_j)_{j=1}^\infty$ and applying Lemma 1 and Proposition 2 of [11], we get

$$\lim_{n \to \infty} \frac{K_n}{\Phi(n)} = 1 \quad \text{a.s.} \tag{18}$$

Using the remarks following Proposition 2 of [11], we have for all positive integers $r$,

$$\lim_{n \to \infty} \frac{\sum_{s=r}^\infty K_{n,s}}{\sum_{s=r}^\infty \Phi_s(n)} = 1 \quad \text{a.s.} \tag{19}$$

Lemma 7 below, known as Potter's Theorem, is Theorem 1.5.6(i) of [7] and gives some bounds
on slowly varying functions. Note that since Theorem 2 only concerns the values of $\ell(n)$ for $n \in \mathbb{N}$,
we may and will assume, here and throughout this section, that $\ell$ is bounded away from zero and
infinity on $(0, x]$ for any $x > 0$.

Lemma 7. Suppose $\ell : (0, \infty) \to (0, \infty)$ is a slowly varying function. Let $\delta > 0$. There exists a
positive number $x_0(\delta)$ such that if $x \ge x_0(\delta)$ and $\lambda \ge 1$, then

$$\frac{1}{(1 + \delta)\lambda^\delta} \le \frac{\ell(\lambda x)}{\ell(x)} \le (1 + \delta)\lambda^\delta. \tag{20}$$

Also, there exists a constant $C > 0$ such that $\ell(x) \ge C x^{-\delta}$ for all $x \ge x_0(\delta)$.

Lemma 8. We have

$$\lim_{t \to \infty} \frac{t^{1-\alpha}}{\ell(t)} \int_0^\infty e^{-tx} G(x) \, dx = \Gamma(1 - \alpha) \quad \text{in probability.}$$

Proof. We use Poissonization. Combining (4) and (18), we get

$$\lim_{n \to \infty} \frac{\Phi(n)}{n^\alpha \ell(n)} = \Gamma(1 - \alpha) \quad \text{in probability.}$$

Since $t \mapsto \Phi(t)$ is nondecreasing and $\ell$ is slowly varying, it follows from Lemma 7 that

$$\lim_{t \to \infty} \frac{\Phi(t)}{t^\alpha \ell(t)} = \Gamma(1 - \alpha) \quad \text{in probability.}$$

The result now follows from (16). □



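Lemma 9. We have

$$\lim_{t \to \infty} \frac{t^{1-\alpha}}{\ell(t)} \int_0^\infty e^{-tx} \big(G(x) - x^{-\alpha} \ell(1/x)\big) \, dx = 0 \quad \text{in probability.}$$

Proof. In view of Lemma 8, it suffices to show that

$$\lim_{t \to \infty} \frac{t^{1-\alpha}}{\ell(t)} \int_0^\infty e^{-tx} x^{-\alpha} \ell(1/x) \, dx = \Gamma(1 - \alpha). \tag{21}$$

Choose $\delta \le 1$ such that $\delta + \alpha < 1$, and choose $x_0$ so that (20) holds for $x \ge x_0$ and $\lambda \ge 1$. Substituting
$y = tx$, we get

$$\frac{t^{1-\alpha}}{\ell(t)} \int_0^\infty e^{-tx} x^{-\alpha} \ell(1/x) \, dx = \frac{1}{\ell(t)} \int_0^\infty e^{-y} y^{-\alpha} \ell(t/y) \, dy
= \int_0^\infty e^{-y} y^{-\alpha} \frac{\ell(t/y)}{\ell(t)} 1_{\{y \le t/x_0\}} \, dy + \frac{1}{\ell(t)} \int_{t/x_0}^\infty e^{-y} y^{-\alpha} \ell(t/y) \, dy. \tag{22}$$

By (20), we have $\ell(t/y)/\ell(t) \le \max\{2y^{-\delta}, 2y^\delta\}$ whenever $t \ge x_0$ and $0 < y \le t/x_0$. Also, since $\ell$
is a slowly varying function, $\lim_{t \to \infty} \ell(t/y)/\ell(t) = 1$ for all $y > 0$. Therefore, by the Dominated
Convergence Theorem,

$$\lim_{t \to \infty} \int_0^\infty e^{-y} y^{-\alpha} \frac{\ell(t/y)}{\ell(t)} 1_{\{y \le t/x_0\}} \, dy = \int_0^\infty e^{-y} y^{-\alpha} \, dy = \Gamma(1 - \alpha). \tag{23}$$

Recall from Lemma 7 that there is a constant $C$ such that $\ell(t) \ge C t^{-\delta}$ for all $t \ge x_0$. By
assumption, there is a constant $B$ such that $\ell(x) \le B$ for $0 < x \le x_0$. Therefore,

$$\limsup_{t \to \infty} \frac{1}{\ell(t)} \int_{t/x_0}^\infty e^{-y} y^{-\alpha} \ell(t/y) \, dy \le \limsup_{t \to \infty} \frac{B t^\delta}{C} \left(\frac{t}{x_0}\right)^{-\alpha} e^{-t/x_0} = 0. \tag{24}$$

Equation (21) follows from (22), (23), and (24). □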

Lemma 10. There exists a positive number $C_0$ such that if $C > C_0$, then

$$\lim_{x \to 0} P\big(G(x) > C \ell(1/x) x^{-\alpha}\big) = 0.$$

Proof. Suppose $G(x) > C \ell(1/x) x^{-\alpha}$. Since $x \mapsto G(x)$ is nonincreasing,

$$t^{1-\alpha} \int_0^\infty e^{-ty} G(y) \, dy \ge t^{1-\alpha} \int_0^x e^{-ty} C \ell(1/x) x^{-\alpha} \, dy = C (tx)^{-\alpha} \ell(1/x) (1 - e^{-tx}).$$

Therefore, if $t = 1/x$, then

$$\frac{t^{1-\alpha}}{\ell(t)} \int_0^\infty e^{-ty} G(y) \, dy \ge C (1 - e^{-1}).$$

By Lemma 8, the result follows with $C_0 = \Gamma(1 - \alpha)/(1 - e^{-1})$. □
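
Lemma 11. There exists a positive number $C$ such that if $\tilde{G}(x) = G(x) 1_{\{G(x) > C \ell(1/x) x^{-\alpha}\}}$, then

$$\lim_{t \to \infty} \frac{t^{1-\alpha}}{\ell(t)} \int_0^\infty e^{-tx} \tilde{G}(x) \, dx = 0 \quad \text{in probability.}$$

Proof. Let $\varepsilon > 0$. Choose $C_1 > C_0$, where $C_0$ is the constant from Lemma 10, and let $C = 2^{1+\alpha} C_1$.
Choose an integer $M$ large enough that $C_1 2^{-M\alpha} e^{-2^M} < \varepsilon/2$ and $2^{-M(1-\alpha)} \Gamma(1 - \alpha) e < \varepsilon/4$. For
$t > 0$, define the event

$$A_t = \{G(2^k t^{-1}) \le C_1 \ell(2^{-k} t) (2^k t^{-1})^{-\alpha} \text{ for } k = -M, -M + 1, \dots, M - 1, M\}.$$

By Lemma 10, there exists $T_1 < \infty$ such that if $t \ge T_1$, then $P(A_t) > 1 - \varepsilon/2$. Because
$\tilde{G}(x) \le G(x) \le G(2^M t^{-1})$ for all $x \ge 2^M t^{-1}$, on the event $A_t$ we have

$$\frac{t^{1-\alpha}}{\ell(t)} \int_{2^M t^{-1}}^\infty e^{-tx} \tilde{G}(x) \, dx \le \frac{t^{1-\alpha}}{\ell(t)} \cdot C_1 \ell(2^{-M} t) (2^M t^{-1})^{-\alpha} \int_{2^M t^{-1}}^\infty e^{-tx} \, dx
= C_1 2^{-M\alpha} e^{-2^M} \frac{\ell(2^{-M} t)}{\ell(t)}. \tag{25}$$

Because $\ell$ is slowly varying, $\ell(2^{-M} t)/\ell(t) \to 1$ as $t \to \infty$. Therefore, there exists a $T_2$ such that
for $t \ge T_2$, on $A_t$ we have

$$\frac{t^{1-\alpha}}{\ell(t)} \int_{2^M t^{-1}}^\infty e^{-tx} \tilde{G}(x) \, dx < \frac{\varepsilon}{2}. \tag{26}$$

Also, on $A_t$, if $2^k t^{-1} \le x \le 2^{k+1} t^{-1}$ for some integer $k$ satisfying $-M \le k \le M - 1$, then

$$G(x) \le G(2^k t^{-1}) \le C_1 \ell(2^{-k} t) (2^k t^{-1})^{-\alpha} \le C_1 \ell(2^{-k} t) (x/2)^{-\alpha} = 2^\alpha C_1 \ell(2^{-k} t) x^{-\alpha}.$$

By Lemma 7, there exists $T_3 < \infty$ such that if $t \ge T_3$ and $2^k t^{-1} \le x \le 2^{k+1} t^{-1}$ for some integer
$k$ satisfying $-M \le k \le M - 1$, then $\ell(2^{-k} t)/\ell(1/x) \le 2$. Therefore, if $A_t$ occurs and $t \ge T_3$, then

$$G(x) \le 2^{1+\alpha} C_1 \ell(1/x) x^{-\alpha} = C \ell(1/x) x^{-\alpha}.$$

In this case, $\tilde{G}(x) = 0$ for $2^{-M} t^{-1} \le x \le 2^M t^{-1}$ and thus

$$\frac{t^{1-\alpha}}{\ell(t)} \int_{2^{-M} t^{-1}}^{2^M t^{-1}} e^{-tx} \tilde{G}(x) \, dx = 0. \tag{27}$$

If $0 < x \le 2^{-M} t^{-1}$, then $e^{-tx} \le 1 \le e \cdot e^{-2^M tx}$. Therefore,

$$\frac{t^{1-\alpha}}{\ell(t)} \int_0^{2^{-M} t^{-1}} e^{-tx} \tilde{G}(x) \, dx \le \frac{e \, t^{1-\alpha}}{\ell(t)} \int_0^\infty e^{-2^M tx} G(x) \, dx
= \left( e \, 2^{-M(1-\alpha)} \frac{\ell(2^M t)}{\ell(t)} \right) \frac{(2^M t)^{1-\alpha}}{\ell(2^M t)} \int_0^\infty e^{-2^M tx} G(x) \, dx. \tag{28}$$

By Lemma 8 with $2^M t$ in place of $t$, the portion of the right-hand side of (28) after the parentheses
converges in probability to $\Gamma(1 - \alpha)$ as $t \to \infty$. Also, because $\ell$ is slowly varying, we have
$\ell(2^M t)/\ell(t) \to 1$ as $t \to \infty$. Therefore, there exists $T_4$ such that if $t \ge T_4$, then

$$P\left( \frac{t^{1-\alpha}}{\ell(t)} \int_0^{2^{-M} t^{-1}} e^{-tx} \tilde{G}(x) \, dx > \frac{\varepsilon}{2} \right) < \frac{\varepsilon}{2}. \tag{29}$$

It follows from (26), (27), and (29) that if $t \ge \max\{T_1, T_2, T_3, T_4\}$, then

$$P\left( \frac{t^{1-\alpha}}{\ell(t)} \int_0^\infty e^{-tx} \tilde{G}(x) \, dx > \varepsilon \right) < \varepsilon,$$

which implies the lemma. □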

Lemma 12. We have

$$\lim_{x \to 0} \frac{x^\alpha G(x)}{\ell(1/x)} = 1 \quad \text{in probability.}$$

Proof. Choose $C_0$ as in Lemma 10, and choose $C > \max\{C_0, 1\}$ large enough that the conclusion
of Lemma 11 holds. For $x > 0$, let

$$Y(x) = \min\{G(x), C \ell(1/x) x^{-\alpha}\} - x^{-\alpha} \ell(1/x).$$

In view of Lemma 10, it suffices to show that

$$\lim_{x \to 0} \frac{x^\alpha Y(x)}{\ell(1/x)} = 0 \quad \text{in probability.} \tag{30}$$

Note that $|Y(x)| \le C x^{-\alpha} \ell(1/x)$ for all $x > 0$. By Lemmas 9 and 11,

$$\lim_{t \to \infty} \frac{t^{1-\alpha}}{\ell(t)} \int_0^\infty e^{-tx} Y(x) \, dx = 0 \quad \text{in probability.} \tag{31}$$

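We proceed by contradiction. Suppose (30) fails to hold. Then there exists $0 < \varepsilon < 1/2$ and
a sequence of positive numbers $(s_n)_{n=1}^\infty$ converging to zero such that one of the following holds:

1. We have $P(Y(s_n) > \varepsilon s_n^{-\alpha} \ell(1/s_n)) > \varepsilon$ for all $n$.

2. We have $P(Y(s_n) < -\varepsilon s_n^{-\alpha} \ell(1/s_n)) > \varepsilon$ for all $n$.

Assume for now that we are in the first case, so $P(Y(s_n) > \varepsilon s_n^{-\alpha} \ell(1/s_n)) > \varepsilon$ for all $n$. If
$Y(s_n) > \varepsilon s_n^{-\alpha} \ell(1/s_n)$, then $G(s_n) > (1 + \varepsilon) s_n^{-\alpha} \ell(1/s_n)$. In this case, if $x \le s_n$, we have

$$G(x) \ge G(s_n) > (1 + \varepsilon) s_n^{-\alpha} \ell(1/s_n) = (1 + \varepsilon) \left(\frac{x}{s_n}\right)^\alpha \frac{\ell(1/s_n)}{\ell(1/x)} \cdot x^{-\alpha} \ell(1/x).$$

Choose $\delta > 0$ small enough that $(1 + \delta)^{-1} (1 + \varepsilon)^{1-\alpha-\delta} > 1$. If $s_n/(1 + \varepsilon) \le x \le s_n$ and if $n$ is
large enough that $1/s_n \ge x_0(\delta)$, then by Lemma 7,

$$\frac{1}{(1 + \delta)(1 + \varepsilon)^\delta} \le \frac{\ell(1/s_n)}{\ell(1/x)} \le (1 + \delta)(1 + \varepsilon)^\delta.$$

Therefore,

$$G(x) > \frac{(1 + \varepsilon)^{1-\alpha-\delta}}{1 + \delta} x^{-\alpha} \ell(1/x).$$

It follows that for $s_n/(1 + \varepsilon) \le x \le s_n$, we have

$$Y(x) \ge \left( \frac{(1 + \varepsilon)^{1-\alpha-\delta}}{1 + \delta} - 1 \right) x^{-\alpha} \ell(1/x)
\ge \left( \frac{(1 + \varepsilon)^{1-\alpha-\delta}}{1 + \delta} - 1 \right) \frac{1}{(1 + \delta)(1 + \varepsilon)^\delta} s_n^{-\alpha} \ell(1/s_n)
= \eta s_n^{-\alpha} \ell(1/s_n), \tag{32}$$

where $\eta > 0$.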

Let $f : [0, \infty) \to \mathbb{R}$ be the function such that $f(x) = 0$ if either $x \le 1/(1 + \varepsilon)$ or $x \ge 1$,
$f((2 + \varepsilon)/(2 + 2\varepsilon)) = 1$, and $f$ is linear on the two intervals $[1/(1 + \varepsilon), (2 + \varepsilon)/(2 + 2\varepsilon)]$ and
$[(2 + \varepsilon)/(2 + 2\varepsilon), 1]$. Note that

$$\int_0^\infty f(x) \, dx = \frac{1}{2} \left( 1 - \frac{1}{1 + \varepsilon} \right) = \frac{\varepsilon}{2(1 + \varepsilon)}. \tag{33}$$

Let $\mathcal{A}$ be the algebra of functions of the form $\varphi(x) = a_1 e^{-t_1 x} + \cdots + a_m e^{-t_m x}$ for $x \ge 0$, where $m$
is a nonnegative integer, $a_1, \dots, a_m \in \mathbb{R}$, and $t_1, \dots, t_m \ge 1$. By the Stone-Weierstrass Theorem
(see, for example, Theorem D.23 on p. 346 of [8]), the set $\mathcal{A}$ is uniformly dense in the set
$C_0([0, \infty))$ of continuous functions from $[0, \infty)$ to $\mathbb{R}$ that vanish at infinity. Therefore, if we
choose $\zeta = \varepsilon\eta/(16\Gamma(1 - \alpha)C)$, then there is a function $g \in \mathcal{A}$ such that $|g(x) - e^x f(x)| \le \zeta$ for
all $x \ge 0$. Letting $h(x) = e^{-x} g(x)$ for $x \ge 0$, we have $|h(x) - f(x)| \le \zeta e^{-x}$ for all $x \ge 0$. Write

$$g(x) = a_1 e^{-t_1 x} + \cdots + a_m e^{-t_m x}.$$

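Choose $\theta = \min\{\varepsilon/2m, 2^{1-\alpha} \varepsilon\eta/(8(|a_1| + \cdots + |a_m|))\} > 0$. By (31), we can choose $n$ large enough
that $2/s_n \ge T$, where for $t \ge T$ we have

$$P\left( \left| \frac{t^{1-\alpha}}{\ell(t)} \int_0^\infty e^{-tx} Y(x) \, dx \right| > \theta \right) < \theta.$$

It follows that with probability at least $1 - m\theta$,

$$\left| \int_0^\infty h(x/s_n) Y(x) \, dx \right| \le \sum_{i=1}^m |a_i| \left| \int_0^\infty e^{-(t_i + 1)x/s_n} Y(x) \, dx \right|
\le \theta s_n^{1-\alpha} \ell(1/s_n) \sum_{i=1}^m |a_i| (t_i + 1)^{\alpha-1} \frac{\ell((t_i + 1)/s_n)}{\ell(1/s_n)}. \tag{34}$$

Also, using (21) with $1/s_n$ in place of $t$, we have that for sufficiently large $n$,

$$\left| \int_0^\infty \big(f(x/s_n) - h(x/s_n)\big) Y(x) \, dx \right| \le \int_0^\infty \zeta e^{-x/s_n} C x^{-\alpha} \ell(1/x) \, dx \le 2C\Gamma(1 - \alpha) \zeta s_n^{1-\alpha} \ell(1/s_n). \tag{35}$$

Since $\ell$ is slowly varying and $(t_i + 1)^{\alpha-1} \le 2^{\alpha-1}$, it follows from (34) and (35) that with probability at least $1 - m\theta$,

$$\limsup_{n \to \infty} \frac{1}{s_n^{1-\alpha} \ell(1/s_n)} \left| \int_0^\infty f(x/s_n) Y(x) \, dx \right| \le \frac{\varepsilon\eta}{4}. \tag{36}$$

However, (32) and (33) imply that for sufficiently large $n$, with probability at least $\varepsilon$,

$$\int_0^\infty f(x/s_n) Y(x) \, dx = \int_{s_n/(1+\varepsilon)}^{s_n} f(x/s_n) Y(x) \, dx
\ge \eta s_n^{-\alpha} \ell(1/s_n) \int_{s_n/(1+\varepsilon)}^{s_n} f(x/s_n) \, dx
= \frac{\eta\varepsilon}{2(1 + \varepsilon)} s_n^{1-\alpha} \ell(1/s_n),$$

which contradicts (36) because $m\theta < \varepsilon$.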

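It remains now to consider the second case. Assume that $P(Y(s_n) < -\varepsilon s_n^{-\alpha} \ell(1/s_n)) > \varepsilon$ for
all $n$. If $Y(s_n) < -\varepsilon s_n^{-\alpha} \ell(1/s_n)$, then $G(s_n) < (1 - \varepsilon) s_n^{-\alpha} \ell(1/s_n)$. In this case, if $x \ge s_n$, then

$$G(x) \le G(s_n) < (1 - \varepsilon) s_n^{-\alpha} \ell(1/s_n) = (1 - \varepsilon) \left(\frac{x}{s_n}\right)^\alpha \frac{\ell(1/s_n)}{\ell(1/x)} \cdot x^{-\alpha} \ell(1/x).$$

Choose $\delta > 0$ small enough that $(1 + \delta)(1 - \varepsilon)^{1-\alpha-\delta} < 1$. If $s_n \le x \le s_n/(1 - \varepsilon)$ and if $n$ is large
enough that $(1 - \varepsilon)/s_n \ge x_0(\delta)$, then by Lemma 7,

$$\frac{(1 - \varepsilon)^\delta}{1 + \delta} \le \frac{\ell(1/s_n)}{\ell(1/x)} \le \frac{1 + \delta}{(1 - \varepsilon)^\delta}.$$

Therefore,

$$G(x) < (1 + \delta)(1 - \varepsilon)^{1-\alpha-\delta} x^{-\alpha} \ell(1/x).$$

It follows that for $s_n \le x \le s_n/(1 - \varepsilon)$, we have

$$Y(x) \le \big((1 + \delta)(1 - \varepsilon)^{1-\alpha-\delta} - 1\big) x^{-\alpha} \ell(1/x)
\le \big((1 + \delta)(1 - \varepsilon)^{1-\alpha-\delta} - 1\big)(1 + \delta)^{-1} (1 - \varepsilon)^{\alpha+\delta} s_n^{-\alpha} \ell(1/s_n)
= -\eta s_n^{-\alpha} \ell(1/s_n), \tag{37}$$

where $\eta > 0$.

This time, let $f : [0, \infty) \to \mathbb{R}$ be the function such that $f(x) = 0$ if $x \le 1$ or $x \ge 1/(1 - \varepsilon)$,
$f((2 - \varepsilon)/(2 - 2\varepsilon)) = 1$, and $f$ is linear on $[1, (2 - \varepsilon)/(2 - 2\varepsilon)]$ and $[(2 - \varepsilon)/(2 - 2\varepsilon), 1/(1 - \varepsilon)]$.
We have

$$\int_0^\infty f(x) \, dx = \frac{1}{2} \left( \frac{1}{1 - \varepsilon} - 1 \right) = \frac{\varepsilon}{2(1 - \varepsilon)}. \tag{38}$$

Define $\zeta$ and $\theta$ as in the previous case. Then (34), (35), and (36) hold as before. However, (37)
and (38) imply that for sufficiently large $n$, with probability at least $\varepsilon$,

$$\int_0^\infty f(x/s_n) Y(x) \, dx = \int_{s_n}^{s_n/(1-\varepsilon)} f(x/s_n) Y(x) \, dx
\le -\eta s_n^{-\alpha} \ell(1/s_n) \int_{s_n}^{s_n/(1-\varepsilon)} f(x/s_n) \, dx
= -\frac{\eta\varepsilon}{2(1 - \varepsilon)} s_n^{1-\alpha} \ell(1/s_n), \tag{39}$$

which again contradicts (36) because $m\theta < \varepsilon$. □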

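Proof of Theorem 2. Fix $r \in \mathbb{N}$. It follows from (4) and (19) that given $\varepsilon > 0$, for sufficiently
large $n$ we have

$$P\left( \left| \sum_{s=r}^\infty K_{n,s} - \sum_{s=r}^\infty \Phi_s(n) \right| < \frac{\varepsilon}{2} n^\alpha \ell(n) \right) > 1 - \frac{\varepsilon}{2} \tag{40}$$

and

$$P\left( \left| \sum_{s=r+1}^\infty K_{n,s} - \sum_{s=r+1}^\infty \Phi_s(n) \right| < \frac{\varepsilon}{2} n^\alpha \ell(n) \right) > 1 - \frac{\varepsilon}{2}. \tag{41}$$

Subtracting (41) from (40) gives that

$$P\left( \left| \frac{K_{n,r}}{n^\alpha \ell(n)} - \frac{\Phi_r(n)}{n^\alpha \ell(n)} \right| < \varepsilon \right) > 1 - \varepsilon$$

for sufficiently large $n$. Therefore, it suffices to show that

$$\lim_{t \to \infty} \frac{\Phi_r(t)}{t^\alpha \ell(t)} = \frac{\alpha\Gamma(r - \alpha)}{r!} \quad \text{in probability.} \tag{42}$$

Let $\theta > 0$ be arbitrary. Because $\sum_{r=1}^\infty \alpha\Gamma(r - \alpha)/r! = \Gamma(1 - \alpha)$, we can choose $N$ large enough
that

$$\sum_{r=1}^N \frac{\alpha\Gamma(r - \alpha)}{r!} > \Gamma(1 - \alpha) - \frac{\theta}{2}. \tag{43}$$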

Let r\ = mm{6/(N + 1), 0/(4T(l — a))}. Note that we can choose a sufficiently large integer L, 
then a sufficiently small positive number 5 (much smaller than 1/L), then a sufficiently large 
integer M (much larger than 1/5), then a sufficiently small positive number e (much smaller than 
l/M) such that 



/ T \ r f AfM\ r s ( M + l ) 



(44) 



for 1 < r < N. By Lemma [T2"| we can choose T\ > sufficiently large that if i > Ti, then 

P((l - e)x _t ^(l/x) < G{x) < (1 + e)x^(l/x) for x = LS/t, (L + l)5/t, . . . , M<5/t) > 1 - rj. 
By Lemma El we can choose T<i > sufficiently large that if i > T2 and L<5/i < x < MS/t, then 

*(l/x) 



1 - e < 



*(t) 



< 1 + e. 



14 



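If $t \ge \max\{T_1, T_2\}$, then with probability at least $1 - \eta$, we have, using (17),

\[
\frac{\Phi_r(t)}{t^{\alpha} \ell(t)} = \frac{t^{r-\alpha}}{r!\, \ell(t)} \sum_{j=1}^{\infty} P_j^{r} e^{-t P_j}
\ge \frac{t^{r-\alpha}}{r!\, \ell(t)} \sum_{k=L}^{M-1} \left( \frac{k\delta}{t} \right)^{r} e^{-(k+1)\delta} \big( G(k\delta/t) - G((k+1)\delta/t) \big),
\]

and the bounds on $G$ and on $\ell(1/x)/\ell(t)$ above give $G(k\delta/t) - G((k+1)\delta/t) \ge t^{\alpha} \ell(t) \delta^{-\alpha} \big( (1-\varepsilon)^2 k^{-\alpha} - (1+\varepsilon)^2 (k+1)^{-\alpha} \big)$.

For $k \le M - 1$,

\[
\frac{(1-\varepsilon)^2}{k^{\alpha}} - \frac{(1+\varepsilon)^2}{(k+1)^{\alpha}}
= (1-\varepsilon)^2 \left( \frac{1}{k^{\alpha}} - \frac{1}{(k+1)^{\alpha}} \right) + \frac{1}{(k+1)^{\alpha}} \big( (1-\varepsilon)^2 - (1+\varepsilon)^2 \big)
\]
\[
\ge \frac{\alpha (1-\varepsilon)^2}{(k+1)^{\alpha+1}} - \frac{4\varepsilon}{(k+1)^{\alpha}}
\ge \frac{\alpha}{(k+1)^{\alpha+1}} \left( (1-\varepsilon)^2 - \frac{4\varepsilon M}{\alpha} \right).
\]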
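The elementary inequality used in this step, $(1-\varepsilon)^2 k^{-\alpha} - (1+\varepsilon)^2 (k+1)^{-\alpha} \ge \alpha (k+1)^{-\alpha-1} \big( (1-\varepsilon)^2 - 4\varepsilon M/\alpha \big)$ for $1 \le k \le M-1$, can be spot-checked numerically. The sketch below is only a sanity check under that reading of the display; the parameter grids in the usage are arbitrary illustrative choices.

```python
def lhs(k: int, eps: float, alpha: float) -> float:
    """(1 - eps)^2 / k^alpha - (1 + eps)^2 / (k + 1)^alpha."""
    return (1 - eps) ** 2 / k ** alpha - (1 + eps) ** 2 / (k + 1) ** alpha

def rhs(k: int, eps: float, alpha: float, big_m: int) -> float:
    """alpha / (k + 1)^(alpha + 1) * ((1 - eps)^2 - 4 * eps * M / alpha)."""
    return alpha / (k + 1) ** (alpha + 1) * ((1 - eps) ** 2 - 4 * eps * big_m / alpha)

def inequality_holds(eps: float, alpha: float, big_m: int) -> bool:
    """Check lhs >= rhs for every k = 1, ..., M - 1 (small slack for rounding)."""
    return all(lhs(k, eps, alpha) >= rhs(k, eps, alpha, big_m) - 1e-12
               for k in range(1, big_m))
```

The first step of the inequality is the mean value theorem applied to $x \mapsto x^{-\alpha}$, and the second uses $k + 1 \le M$.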

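Therefore, if $1 \le r \le N$ and $t \ge \max\{T_1, T_2\}$, then with probability at least $1 - \eta$,

\[
\frac{\Phi_r(t)}{t^{\alpha} \ell(t)} \ge \left( (1-\varepsilon)^2 - \frac{4\varepsilon M}{\alpha} \right) \frac{\alpha}{r!} \sum_{k=L}^{M-1} \frac{k^{r} \delta^{r-\alpha}}{(k+1)^{\alpha+1}}\, e^{-(k+1)\delta}.
\]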

If r > 2 and k > L then 

^ p -(fc+l)3 > f L Vr/. I o^-q-l -(fc+l)a > f L V [ k+2 r-a-1 -Sx 

and if r = 1 and k > L then 



lk+1 



(fc + 1)^ 1 - \L + lJ y ^ ±J ° - \L + 2 

Thus, if 1 < r < N and t > max{Ti, T2}, then with probability at least 1 — 77, we have 

t<*£(t) ~ \L + 2) V V ' a ) r\ J L+l 



(l_ e )2 / ^ r-o-l^ 

L + 2J \ a Jr\J S{L+l) 

> (l-^r(r-a) > (46) 
where the last inequality uses (44).

Since $t \mapsto \Phi(t)$ is nondecreasing and $\ell$ is slowly varying, @ and (18) imply that $\Phi(t)/(t^{\alpha} \ell(t))$
converges in probability to $\Gamma(1 - \alpha)$ as $t \to \infty$. Therefore, there exists $T_3$ such that if $t \ge T_3$,
then

\[
P\left( \frac{\Phi(t)}{t^{\alpha} \ell(t)} < (1 + \eta) \Gamma(1 - \alpha) \right) > 1 - \eta. \tag{47}
\]

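Therefore, combining (46) and (47), if $1 \le r \le N$ and $t \ge \max\{T_1, T_2, T_3\}$, then with probability
at least $1 - (N + 1)\eta \ge 1 - \theta$,

\[
\frac{\Phi_r(t)}{t^{\alpha} \ell(t)} \le \frac{1}{t^{\alpha} \ell(t)} \Big( \Phi(t) - \sum_{\substack{s=1 \\ s \ne r}}^{N} \Phi_s(t) \Big)
\]
\[
< (1 + \eta) \Gamma(1 - \alpha) - (1 - \eta) \sum_{\substack{s=1 \\ s \ne r}}^{N} \frac{\alpha \Gamma(s - \alpha)}{s!}
\]
\[
= \Gamma(1 - \alpha) - \sum_{\substack{s=1 \\ s \ne r}}^{N} \frac{\alpha \Gamma(s - \alpha)}{s!} + \eta \Big( \Gamma(1 - \alpha) + \sum_{\substack{s=1 \\ s \ne r}}^{N} \frac{\alpha \Gamma(s - \alpha)}{s!} \Big)
\]
\[
< \frac{\alpha \Gamma(r - \alpha)}{r!} + \frac{\theta}{2} + 2\eta \Gamma(1 - \alpha)
\]
\[
\le \frac{\alpha \Gamma(r - \alpha)}{r!} + \theta, \tag{48}
\]

using (43). The result (42) for $r = 1, \dots, N$ now follows from (46) and (48). Since $N$ can be
taken to be arbitrarily large, the result holds for all positive integers $r$. □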

3 Description of Example 3

We specify a random sequence $P_1 \ge P_2 \ge \dots$ such that $\sum_{j=1}^{\infty} P_j = 1$ a.s. in the following way:

1. Begin with any deterministic sequence $q_1 \ge q_2 \ge \dots$ such that $\sum_{j=1}^{\infty} q_j \le 1/2$ and such
that if $g(x) = \max\{j : q_j \ge x\}$, then $\lim_{x \to 0} x^{\alpha} g(x) = 1$.

2. Given a positive integer m, we can define, for k > 2, the integer = 2 . Choose ni 
large enough that $\sum_{k=1}^{\infty} n_k^{\alpha - 1} \le 1/2$.

3. Define a sequence of independent random variables $(R_k)_{k=1}^{\infty}$ such that $R_k$ has the uniform
distribution on $\{1, 2, \dots, n_k\}$ for all $k$. Then for all $k \in \mathbb{N}$, add the number $1/(n_k 2^{2^{R_k}})$ to
the sequence $\lfloor 2^{2^{R_k}} n_k^{\alpha} \rfloor$ times.

4. Add the number

\[
1 - \sum_{j=1}^{\infty} q_j - \sum_{k=1}^{\infty} \frac{\lfloor 2^{2^{R_k}} n_k^{\alpha} \rfloor}{n_k 2^{2^{R_k}}}
\]

to the sequence to make the numbers sum to one.

5. Order the numbers and relabel them $P_1 \ge P_2 \ge \dots$.

Using the method described in the introduction, define an exchangeable random partition $\Pi$
whose asymptotic block frequencies are almost surely given by this sequence $(P_j)_{j=1}^{\infty}$. The next
two lemmas show that $\Pi$ satisfies the conditions of Example 3.

Lemma 13. For the sequence $(P_j)_{j=1}^{\infty}$ defined above, if we define $G(x) = \max\{j : P_j \ge x\}$, then

\[
\lim_{x \to 0} x^{\alpha} G(x) = 1 \quad \text{in probability.}
\]

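Proof. Let $G'(x)$ denote the number of terms that were added to the sequence in step 3 of the
above construction that are greater than or equal to $x$. Because $\lim_{x \to 0} x^{\alpha} g(x) = 1$ by step 1 of
the construction, it suffices to show that

\[
\lim_{x \to 0} x^{\alpha} G'(x) = 0 \quad \text{in probability.} \tag{49}
\]

Let $\varepsilon > 0$. Suppose $1/n_{k+1} \le x \le 1/n_k$. Because $R_j \le n_j$, there can be at most $2^{2^{n_j}} n_j^{\alpha}$ terms in
the sequence that equal $1/(n_j 2^{2^{R_j}})$ for $j = 1, \dots, k - 1$. Therefore,

\[
G'(x) \le \sum_{j=1}^{k-1} 2^{2^{n_j}} n_j^{\alpha} + 2^{2^{R_k}} n_k^{\alpha}\, \mathbf{1}_{\{1/(n_k 2^{2^{R_k}}) \ge x\}}. \tag{50}
\]

By the choice of $n_k$, we have

\[
\sum_{j=1}^{k-1} 2^{2^{n_j}} n_j^{\alpha} \le \frac{\varepsilon}{2}\, n_k^{\alpha} \le \frac{\varepsilon}{2}\, x^{-\alpha}
\]

for sufficiently large $k$. The second term on the right-hand side of (50) will be at most $(\varepsilon/2) x^{-\alpha}$
unless we have both $1/(n_k 2^{2^{R_k}}) \ge x$ and $2^{2^{R_k}} n_k^{\alpha} > (\varepsilon/2) x^{-\alpha}$ or, equivalently, unless

\[
\log_2 \log_2 \left( \frac{\varepsilon}{2 x^{\alpha} n_k^{\alpha}} \right) < R_k \le \log_2 \log_2 \left( \frac{1}{x n_k} \right).
\]

Because $R_k$ has a uniform distribution on $\{1, \dots, n_k\}$, the probability that $R_k$ falls in this interval
is at most

\[
\frac{1}{n_k} \left( 1 + \log_2 \log_2 \left( \frac{1}{x n_k} \right) - \log_2 \log_2 \left( \frac{\varepsilon}{2 x^{\alpha} n_k^{\alpha}} \right) \right). \tag{51}
\]

Note that for all real numbers $z > 1$, we have

\[
\log_2 \log_2 z - \log_2 \log_2 z^{\alpha} = \log_2 \left( \frac{\log_2 z}{\alpha \log_2 z} \right) = \log_2 \left( \frac{1}{\alpha} \right).
\]

By applying this result when $z = 1/(x n_k)$, we see that the probability in (51) tends to zero as
$k \to \infty$. It follows that $\lim_{x \to 0} P(G'(x) > \varepsilon x^{-\alpha}) = 0$ for all $\varepsilon > 0$, and (49) follows. □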
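The identity $\log_2 \log_2 z - \log_2 \log_2 z^{\alpha} = \log_2(1/\alpha)$ and the size of the bound (51) can be illustrated numerically. This is a minimal sketch; the values of `x`, `n_k`, `eps` and `alpha` in the usage are hypothetical and are not derived from the construction of the sequence $(n_k)$.

```python
import math

def loglog2(z: float) -> float:
    """log2(log2(z)); requires z > 1."""
    return math.log2(math.log2(z))

def window_probability_bound(x: float, n_k: int, eps: float, alpha: float) -> float:
    """Bound (51) on P(R_k lands in the window), for R_k uniform on {1, ..., n_k}."""
    upper = loglog2(1.0 / (x * n_k))
    lower = loglog2(eps / (2.0 * x ** alpha * n_k ** alpha))
    return (1.0 + upper - lower) / n_k

# Hypothetical values: the window has length of order log2(1/alpha), so the
# bound is of order 1/n_k.
example = window_probability_bound(x=1e-12, n_k=1000, eps=0.1, alpha=0.5)
```

The point of the doubly logarithmic scale is that the window of bad values of $R_k$ has bounded length, while $R_k$ is spread over $n_k$ values.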

Lemma 14. For the random partition $\Pi$ defined above, if $\Pi_n$ denotes the restriction of $\Pi$ to
$\{1, \dots, n\}$ and $K_n$ denotes the number of blocks of $\Pi_n$, then there exists a constant $C > 0$ such
that

\[
\lim_{k \to \infty} P\big( n_k^{-\alpha} K_{n_k} > \Gamma(1 - \alpha) + C \big) = 1.
\]

Proof. We use Poissonization. Let $(N(t), t \ge 0)$ be a rate one Poisson process, and let $\Phi(t) = E[K_{N(t)} \mid (P_j)_{j=1}^{\infty}]$. By (18), it suffices to show that there is a $C > 0$ such that

\[
\liminf_{k \to \infty} n_k^{-\alpha} \Phi(n_k) > \Gamma(1 - \alpha) + C \quad \text{a.s.} \tag{52}
\]

For all $k \in \mathbb{N}$, designate $\lfloor 2^{2^{R_k}} n_k^{\alpha} \rfloor$ blocks of $\Pi$ with asymptotic frequency $1/(n_k 2^{2^{R_k}})$ as marked
blocks, while the other blocks of $\Pi$ will be unmarked. If there are more than $\lfloor 2^{2^{R_k}} n_k^{\alpha} \rfloor$ blocks with
asymptotic frequency $1/(n_k 2^{2^{R_k}})$ because $q_j = 1/(n_k 2^{2^{R_k}})$ for some $j$, then choose at random the
blocks to mark. Note that the marked blocks correspond to the terms that were added in step
3 of the above construction. The unmarked blocks all have asymptotic frequency $q_j$ for some $j$,
except for the block added in step 4 of the construction. Let $\Phi'(t)$ be the expected number of
marked blocks of $\Pi_{N(t)}$ conditional on $(P_j)_{j=1}^{\infty}$, and let $\Phi''(t)$ be the expected number of unmarked
blocks of $\Pi_{N(t)}$ conditional on $(P_j)_{j=1}^{\infty}$. Note that $\Phi(t) = \Phi'(t) + \Phi''(t)$. By Proposition Q] and
(18), we have

\[
\lim_{k \to \infty} n_k^{-\alpha} \Phi''(n_k) = \Gamma(1 - \alpha) \quad \text{a.s.} \tag{53}
\]

The number of integers in the set $\{1, \dots, N(n_k)\}$ that are in a block of $\Pi$ with asymptotic
frequency $1/(n_k 2^{2^{r}})$ has a Poisson distribution with mean $2^{-2^{r}}$. Therefore, on the event $\{R_k = r\}$,
we have

\[
\Phi'(n_k) \ge \lfloor 2^{2^{r}} n_k^{\alpha} \rfloor \big( 1 - e^{-2^{-2^{r}}} \big).
\]

Since $x^{-1}(1 - e^{-x})$ is bounded away from zero for all $x \le 1/4$, it follows that there is a constant
$C > 0$ such that $n_k^{-\alpha} \Phi'(n_k) \ge C$ a.s. for all $k$. This fact, combined with (53), implies (52). □

4 Description of Example 5

We begin by specifying a deterministic sequence of numbers $p_1 \ge p_2 \ge \dots$ such that $\sum_{j=1}^{\infty} p_j = 1$
as follows:

1. Begin with any sequence $q_1 \ge q_2 \ge \dots$ such that if $g(x) = \max\{j : q_j \ge x\}$, then

\[
\lim_{x \to 0} x (\log x)^2 g(x) = 1. \tag{54}
\]

It is not difficult to see that such sequences exist. One arises, for example, in [3].

2. Choose any integer $j$ such that

\[
\sum_{k=j+1}^{\infty} q_k < 1 - \sum_{n=2}^{\infty} n^{-9/2}.
\]

Then remove the terms $q_1, \dots, q_j$ from the sequence.

3. For all $n \ge 2$, add the number $e^{-n^3}$ to the list $\lfloor n^{-9/2} e^{n^3} \rfloor$ times.

4. Add the number

\[
1 - \sum_{k=j+1}^{\infty} q_k - \sum_{n=2}^{\infty} e^{-n^3} \lfloor n^{-9/2} e^{n^3} \rfloor
\]

to the sequence to make the numbers in the new sequence sum to one.

5. Order the numbers and relabel them $p_1 \ge p_2 \ge \dots$.
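The construction above works because the copies of $e^{-n^3}$ added in step 3 carry total mass at most $n^{-9/2}$, and $\sum_{n \ge 2} n^{-9/2}$ is well below one. The following sketch checks both facts numerically; the function names are illustrative, and the direct evaluation is restricted to $n \le 8$ only because $e^{n^3}$ overflows double precision beyond that.

```python
import math

def marked_mass(n: int) -> float:
    """Total mass of the floor(n^(-9/2) e^(n^3)) copies of e^(-n^3) added in step 3."""
    size = math.exp(-n ** 3)
    copies = math.floor(n ** -4.5 * math.exp(n ** 3))
    return copies * size

def power_tail(start: int, terms: int = 100_000) -> float:
    """Truncated sum of k^(-9/2) over k = start, ..., start + terms - 1."""
    return sum(k ** -4.5 for k in range(start, start + terms))

# Mass available for step 3 in total; step 2 asks that the retained q_k sum
# to strictly less than one minus this quantity.
step3_budget = power_tail(2)
```

Since `step3_budget` is small, step 2 only requires discarding enough initial terms of $(q_k)$ to bring the remaining sum slightly below one.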

Using the method described in the introduction, define an exchangeable random partition $\Pi$
whose asymptotic block frequencies are almost surely given by this sequence $(p_j)_{j=1}^{\infty}$. The next
two lemmas establish that $\Pi$ satisfies the conditions of Example 5.

Lemma 15. For the random partition $\Pi$ defined above, if $\Pi_n$ denotes the restriction of $\Pi$ to
$\{1, \dots, n\}$ and $K_n$ denotes the number of blocks of $\Pi_n$, then

\[
\lim_{n \to \infty} \frac{(\log n) K_n}{n} = 1 \quad \text{a.s.}
\]

Proof. We again use Poissonization. Let $(N(t), t \ge 0)$ be a rate one Poisson process, and let
$\Phi(t) = E[K_{N(t)}]$. By (18), it suffices to show that

\[
\lim_{t \to \infty} \frac{(\log t) \Phi(t)}{t} = 1. \tag{55}
\]

For all $n \ge 2$, designate $\lfloor n^{-9/2} e^{n^3} \rfloor$ blocks of $\Pi$ with asymptotic frequency $e^{-n^3}$ as marked
blocks, while the others are unmarked blocks. If there are more than $\lfloor n^{-9/2} e^{n^3} \rfloor$ blocks with
asymptotic frequency $e^{-n^3}$ because $q_k = e^{-n^3}$ for some $k$, then choose at random the blocks to
mark. Note that the marked blocks correspond to the terms $p_k$ that were added in step 3 of
the construction above. The unmarked blocks all have asymptotic frequency $q_k$ for some $k > j$,
except for the one unmarked block that is added in step 4 of the construction. Let $\Phi'(t)$ be the
expected number of marked blocks of $\Pi_{N(t)}$, and let $\Phi''(t)$ be the expected number of unmarked
blocks of $\Pi_{N(t)}$. Note that $\Phi(t) = \Phi'(t) + \Phi''(t)$. In view of (54), we can apply Proposition [I] with
$\ell(t) = (\log t)^{-2}$ for $t > 1$ in combination with (18) to get

\[
\lim_{t \to \infty} \frac{(\log t) \Phi''(t)}{t} = 1. \tag{56}
\]

That $q_1, \dots, q_j$ were deleted and one unmarked block was added does not affect this conclusion.

Now, choose $t$ such that $e^{(n-1)^3} \le t \le e^{n^3}$. The number of marked blocks of $\Pi$ with asymptotic
frequency at least $e^{-(n-1)^3}$ is

\[
\sum_{k=2}^{n-1} \lfloor k^{-9/2} e^{k^3} \rfloor \le C_1 n^{-9/2} e^{(n-1)^3},
\]

where $C_1$ is a positive constant that does not depend on $n$. This bound holds because the sum is
dominated by the largest term. If a block of $\Pi$ has asymptotic frequency $q$, then the probability
that at least one of the first $N(t)$ integers is in the block is $1 - e^{-qt} \le qt$. Therefore, the expected
number of marked blocks of $\Pi_{N(t)}$ with asymptotic frequency $e^{-n^3}$ or smaller is at most

\[
\sum_{k=n}^{\infty} (e^{-k^3} t) \cdot k^{-9/2} e^{k^3} = t \sum_{k=n}^{\infty} k^{-9/2} \le C_2 n^{-7/2} t,
\]

where $C_2$ is another positive constant that does not depend on $n$. Therefore, $\Phi'(t) \le C_1 n^{-9/2} t + C_2 n^{-7/2} t$.
Since $\log t \le n^3$, it follows that

\[
\lim_{t \to \infty} \frac{(\log t) \Phi'(t)}{t} = 0. \tag{57}
\]

Now (55) follows from (57) and (56). □
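The final step of this proof uses $(\log t)\Phi'(t)/t \le n^3 (C_1 n^{-9/2} + C_2 n^{-7/2})$, which tends to zero. The sketch below illustrates the tail estimate and the decay; the explicit value $C_2 = 9/7$ comes from comparing the sum with an integral and is an assumption made for illustration (as is $C_1 = 1$), not a constant taken from the paper.

```python
def power_tail(n: int, terms: int = 100_000) -> float:
    """Truncated sum of k^(-9/2) over k = n, ..., n + terms - 1."""
    return sum(k ** -4.5 for k in range(n, n + terms))

def tail_bound(n: int) -> float:
    """Integral comparison: sum_{k >= n} k^(-9/2) <= n^(-9/2) + (2/7) n^(-7/2) <= (9/7) n^(-7/2)."""
    return (9.0 / 7.0) * n ** -3.5

def scaled_bound(n: int, c1: float = 1.0, c2: float = 9.0 / 7.0) -> float:
    """n^3 * (c1 * n^(-9/2) + c2 * n^(-7/2)), an upper bound for (log t) * Phi'(t) / t."""
    return n ** 3 * (c1 * n ** -4.5 + c2 * n ** -3.5)
```

The decay of `scaled_bound` is only of order $n^{-1/2}$, which on the scale of $t$ is extremely slow, since $n$ is of order $(\log t)^{1/3}$.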
Lemma 16. For the random partition $\Pi$ defined above, if $\Pi_n$ denotes the restriction of $\Pi$ to
$\{1, \dots, n\}$ and $K_{n,r}$ denotes the number of blocks of $\Pi_n$ of size $r$, then for $r \ge 2$, the quantity
$n^{-1} (\log n)^2 K_{n,r}$ does not converge to $1/[r(r - 1)]$ in probability as $n \to \infty$.



Proof. We consider the sequence $(K_{\lfloor m_n \rfloor, r})_{n=1}^{\infty}$, where $m_n = e^{n^3}$ for all $n$. Let $(N(t), t \ge 0)$ be a
rate one Poisson process. There are at least $\lfloor n^{-9/2} e^{n^3} \rfloor$ blocks of $\Pi$ with asymptotic frequency
$e^{-n^3}$. Order these blocks at random, and then let $A_{i,n}$ be the event that the $i$th of these blocks contains
exactly $r$ of the integers $1, \dots, N(m_n)$. Because the number of the integers $\{1, \dots, N(m_n)\}$
in one of these blocks has a Poisson distribution with mean 1, we have $P(A_{i,n}) = e^{-1}/r!$ for all
$i$ and $n$. Also, for any $n$, the events $A_{i,n}$ for $1 \le i \le \lfloor n^{-9/2} e^{n^3} \rfloor$ are independent. It follows that
for all $n$, the random variable $K_{N(m_n), r}$ stochastically dominates a Binomial$(\lfloor n^{-9/2} e^{n^3} \rfloor, e^{-1}/r!)$
random variable. It now follows from standard large deviations estimates that

\[
\lim_{n \to \infty} P\left( K_{N(m_n), r} > \frac{2 e^{-1}}{3\, r!}\, n^{-9/2} e^{n^3} \right) = 1. \tag{58}
\]

Because $N(m_n)$ has the Poisson distribution with mean $m_n$, we have $\mathrm{Var}(N(m_n)) = m_n$ and
therefore $E[|N(m_n) - m_n|] \le m_n^{1/2}$. Since $|K_{N(m_n), r} - K_{\lfloor m_n \rfloor, r}| \le |N(m_n) - \lfloor m_n \rfloor|$, it follows that
$E[|K_{N(m_n), r} - K_{\lfloor m_n \rfloor, r}|] \le e^{n^3/2} + 1$. Combining this result with Markov's inequality gives

\[
\lim_{n \to \infty} P\left( |K_{N(m_n), r} - K_{\lfloor m_n \rfloor, r}| > \frac{e^{-1}}{3\, r!}\, n^{-9/2} e^{n^3} \right) = 0. \tag{59}
\]

Combining (58) and (59) gives

\[
\lim_{n \to \infty} P\left( K_{\lfloor m_n \rfloor, r} > \frac{e^{-1}}{3\, r!}\, n^{-9/2} e^{n^3} \right) = 1.
\]

Since $m_n (\log m_n)^{-2} = n^{-6} e^{n^3}$, the result follows. □
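The last step compares two scales: the number of marked blocks of frequency $e^{-n^3}$, of order $n^{-9/2} e^{n^3}$, and the normalization $m_n (\log m_n)^{-2} = n^{-6} e^{n^3}$. Their ratio is $n^{3/2}$, which is unbounded. A minimal sketch of this comparison, done in log space because $e^{n^3}$ overflows quickly (function names are illustrative):

```python
import math

def log_marked_count(n: int) -> float:
    """log of n^(-9/2) * e^(n^3)."""
    return -4.5 * math.log(n) + n ** 3

def log_normalizer(n: int) -> float:
    """log of m_n / (log m_n)^2 with m_n = e^(n^3), so log m_n = n^3."""
    return n ** 3 - 2.0 * math.log(n ** 3)

def log_ratio(n: int) -> float:
    """log of the ratio of the two scales; algebraically equal to 1.5 * log(n)."""
    return log_marked_count(n) - log_normalizer(n)
```

Because the ratio diverges, $m_n^{-1} (\log m_n)^2 K_{\lfloor m_n \rfloor, r}$ cannot stay near the finite constant $1/[r(r-1)]$ along this subsequence.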



5 Proof of Theorem 6

We will assume that $(\Psi_n(t), t \ge 0)$ is obtained from Kingman's coalescent $(\Theta_n(t), t \ge 0)$ as in
(12). For any partition $\pi$ of $\{1, \dots, n\}$, let $|\pi|$ denote the number of blocks of $\pi$. For $1 \le k \le n$,
let $T_k = \inf\{t : |\Theta_n(t)| = k\}$.

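Lemma 17. For all $\varepsilon > 0$, there exists a positive constant $C$ such that with probability at least
$1 - \varepsilon$, we have

\[
\left| T_k - \left( \frac{2}{k} - \frac{2}{n} \right) \right| \le \frac{C}{n^{9/8}}
\]

for all integers $k$ such that $n^{3/4} \le k \le n$.

Proof. If $2 \le k \le n$, then $T_{k-1} - T_k$ has an exponential distribution with rate $\binom{k}{2}$. Since $T_n = 0$,
it follows that

\[
E[T_k] = \sum_{j=k+1}^{n} E[T_{j-1} - T_j] = \sum_{j=k+1}^{n} \frac{2}{j(j-1)} = 2 \sum_{j=k+1}^{n} \left( \frac{1}{j-1} - \frac{1}{j} \right) = \frac{2}{k} - \frac{2}{n}.
\]

For $1 \le k \le n$, let $Y_k = T_k - E[T_k]$. Note that $Y_{k-1} - Y_k = T_{k-1} - T_k - 2/[k(k - 1)]$, and these
increments are independent. Therefore,

\[
\mathrm{Var}(Y_k) = \sum_{j=k+1}^{n} \mathrm{Var}(Y_{j-1} - Y_j) = \sum_{j=k+1}^{n} \mathrm{Var}(T_{j-1} - T_j) = \sum_{j=k+1}^{n} \frac{4}{j^2 (j-1)^2} \le \frac{C_1}{k^3}
\]

for some positive constant $C_1$. By Kolmogorov's Maximal Inequality,

\[
P\left( \max_{k \le j \le n} |Y_j| > \frac{C}{k^{3/2}} \right) \le \frac{k^3}{C^2} \cdot \frac{C_1}{k^3} = \frac{C_1}{C^2},
\]

which is less than $\varepsilon$ if we take $C$ sufficiently large. The result follows by taking $k = \lceil n^{3/4} \rceil$, in
which case $C/k^{3/2} \le C/n^{9/8}$. □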
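The two sums in this proof can be verified exactly with rational arithmetic. The sketch below checks the telescoping formula $E[T_k] = 2/k - 2/n$ and the variance bound; the explicit constant $C_1 = 16/3$ used in the check is an illustrative choice that happens to work, not a value stated in the paper.

```python
from fractions import Fraction

def expected_time(k: int, n: int) -> Fraction:
    """E[T_k] as the sum of the means 2 / (j (j - 1)) of the holding times, j = k+1..n."""
    return sum((Fraction(2, j * (j - 1)) for j in range(k + 1, n + 1)), Fraction(0))

def variance_sum(k: int, n: int) -> Fraction:
    """Var(T_k) as the sum of the variances 4 / (j (j - 1))^2 of the holding times."""
    return sum((Fraction(4, (j * (j - 1)) ** 2) for j in range(k + 1, n + 1)), Fraction(0))
```

Holding times are exponential with rate $\binom{j}{2}$, so each has mean $2/(j(j-1))$ and variance equal to the square of that mean.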

For $1 \le k \le n$, let $U_k = \inf\{t : |\Psi_n(t)| = k\}$. Define the function $g : [0, \infty) \to [0, \infty)$ by
$g(t) = (1 - \alpha)^{-(1-\alpha)} t^{1-\alpha}$, where $\alpha = \gamma/(1 + \gamma)$. It follows from (12) that for all $t \ge 0$,

/a(T) 7+1 \ / +(i-«)(7+i) \ 

•„toM> = = e~( (1 _ a . )(1 _„„ 7+1)(7 + 1) ) - e.w. 

Therefore, $U_k = g(T_k)$ for all $k$.
Let

\[
L_n = \sum_{k=2}^{n} k (U_{k-1} - U_k).
\]

Note that $L_n$ is the sum of the lengths of all branches in the coalescent tree because $U_{k-1} - U_k$
is the amount of time for which there are exactly $k$ lineages. Let $m = \lceil n^{3/4} \rceil + 1$, and let

\[
L'_n = \sum_{k=m}^{n} k (U_{k-1} - U_k),
\]

which is the total length of all branches in the coalescent tree when the tree is truncated at the
point where the number of lineages reaches $\lceil n^{3/4} \rceil$.
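The quantities $L_n$ and $L'_n$ are straightforward to simulate. The sketch below assumes the time change $g(t) = (1-\alpha)^{-(1-\alpha)} t^{1-\alpha}$ with $U_k = g(T_k)$, and draws the holding times of Kingman's coalescent as independent exponentials with rate $\binom{k}{2}$. The function names and the use of Python's `random.Random` are illustrative choices.

```python
import math
import random

def g(t: float, alpha: float) -> float:
    """Time change g(t) = (1 - alpha)^(-(1 - alpha)) * t^(1 - alpha)."""
    return (1 - alpha) ** (alpha - 1) * t ** (1 - alpha)

def coalescent_times(n: int, rng: random.Random) -> list:
    """T[k] = first time Kingman's coalescent on n lineages has k blocks; T[n] = 0."""
    T = [0.0] * (n + 1)  # index 0 is unused
    for k in range(n, 1, -1):
        T[k - 1] = T[k] + rng.expovariate(k * (k - 1) / 2)
    return T

def tree_lengths(n: int, alpha: float, rng: random.Random) -> tuple:
    """Return (L_n, L'_n): total branch length and its truncation at m = ceil(n^(3/4)) + 1."""
    T = coalescent_times(n, rng)
    U = [0.0] * (n + 1)
    for k in range(1, n + 1):
        U[k] = g(T[k], alpha)
    m = math.ceil(n ** 0.75) + 1
    total = sum(k * (U[k - 1] - U[k]) for k in range(2, n + 1))
    truncated = sum(k * (U[k - 1] - U[k]) for k in range(m, n + 1))
    return total, truncated
```

Dividing either output by $n^{\alpha}$ for large $n$ gives an empirical view of the limits in Lemmas 18 and 19, although convergence in $n$ may be slow.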

Lemma 18. We have

\[
\lim_{n \to \infty} \frac{L'_n}{n^{\alpha}} = \frac{2^{1-\alpha} (1-\alpha)^{\alpha} \pi}{\sin(\pi \alpha)} \quad \text{in probability.}
\]

Proof. Let $\varepsilon > 0$. By Lemma 17 there is a constant $C$ such that with probability at least $1 - \varepsilon$, we have

\[
\frac{2}{k} - \frac{2}{n} - \frac{C}{n^{9/8}} \le T_k \le \frac{2}{k} - \frac{2}{n} + \frac{C}{n^{9/8}}
\]

whenever $n^{3/4} \le k \le n$. Note that $U_n = 0$ and $L'_n$ is an increasing function of $U_k$ for $m - 1 \le k \le n - 1$.
Therefore, with probability at least $1 - \varepsilon$,

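\[
L'_n = \sum_{k=m}^{n} k \big( g(T_{k-1}) - g(T_k) \big)
\]
\[
\le \sum_{k=m}^{n-1} k \left( g\left( \frac{2}{k-1} - \frac{2}{n} + \frac{C}{n^{9/8}} \right) - g\left( \frac{2}{k} - \frac{2}{n} + \frac{C}{n^{9/8}} \right) \right) + n\, g\left( \frac{2}{n-1} - \frac{2}{n} + \frac{C}{n^{9/8}} \right)
\]
\[
\le n\, g\left( \frac{2}{n-1} - \frac{2}{n} + \frac{C}{n^{9/8}} \right) + \sum_{k=m}^{n-1} k \cdot \frac{2}{k(k-1)}\, g'\left( \frac{2}{k} - \frac{2}{n} + \frac{C}{n^{9/8}} \right), \tag{60}
\]

where the last inequality uses that $g'(t)$ is a decreasing function of $t$ because $0 < \alpha < 1$. The
first term on the right-hand side of (60) is $O(n^{1 - 9(1-\alpha)/8})$ and therefore is $o(n^{\alpha})$. Since $g'(t) =
(1 - \alpha)^{\alpha} t^{-\alpha}$, the second term on the right-hand side of (60) is equal to

\[
\sum_{k=m}^{n-1} \frac{2 (1-\alpha)^{\alpha}}{k-1} \left( \frac{2}{k} - \frac{2}{n} + \frac{C}{n^{9/8}} \right)^{-\alpha} \le 2^{1-\alpha} (1-\alpha)^{\alpha} \sum_{k=m}^{n-1} \frac{1}{k-1} \left( \frac{1}{k} - \frac{1}{n} \right)^{-\alpha}.
\]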

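For all $k$ such that $m \le k \le n - 1$,

\[
\frac{1}{k-1} \left( \frac{1}{k} - \frac{1}{n} \right)^{-\alpha} = \left( \frac{k+1}{k-1} \right) \frac{1}{k+1} \left( \frac{1}{k} - \frac{1}{n} \right)^{-\alpha} \le \frac{m+1}{m-1} \int_{k}^{k+1} \frac{1}{x} \left( \frac{1}{x} - \frac{1}{n} \right)^{-\alpha} dx.
\]

Therefore, the second term on the right-hand side of (60) is at most

\[
2^{1-\alpha} (1-\alpha)^{\alpha} \left( \frac{m+1}{m-1} \right) \int_{0}^{n} \frac{1}{x} \left( \frac{1}{x} - \frac{1}{n} \right)^{-\alpha} dx.
\]

By making the substitution $y = x/n$, we get

\[
\int_{0}^{n} \frac{1}{x} \left( \frac{1}{x} - \frac{1}{n} \right)^{-\alpha} dx = n^{\alpha} \int_{0}^{1} \frac{1}{y} \left( \frac{1}{y} - 1 \right)^{-\alpha} dy = n^{\alpha} \int_{0}^{1} y^{\alpha - 1} (1 - y)^{-\alpha}\, dy = n^{\alpha} \Gamma(\alpha) \Gamma(1 - \alpha) = \frac{n^{\alpha} \pi}{\sin(\pi \alpha)}, \tag{62}
\]

where the last step uses Euler's Reflection Formula (see, for example, p. 9 of [1]). Therefore,
there exists a sequence $(a_n)_{n=1}^{\infty}$ tending to zero such that with probability at least $1 - \varepsilon$,

\[
\frac{L'_n}{n^{\alpha}} \le \frac{2^{1-\alpha} (1-\alpha)^{\alpha} \pi}{\sin(\pi \alpha)} + a_n. \tag{63}
\]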

. r . . ... ... . / .7 9 / 8 

probability at least 1 — e, we have 



L' n =J2 k (9(T k - 1 )-g(T k )) 

2 2 C \ (2 2 C 



k=m 
M 

> 

k=m 
M 



k — 1 n n 9 / 8 



k=m 
M 



k=m 

m — 1 x ' 



2/fc 


- 2/n 




2 2 


-( 


fc n 


2 


c 1 


n 
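For all $k$ such that $m - 1 \le k \le M - 1$,

\[
\frac{1}{k} \left( \frac{1}{k} - \frac{1}{n} \right)^{-\alpha} = \left( \frac{k-1}{k} \right) \frac{1}{k-1} \left( \frac{1}{k} - \frac{1}{n} \right)^{-\alpha} \ge \frac{m-2}{m-1} \int_{k-1}^{k} \frac{1}{x} \left( \frac{1}{x} - \frac{1}{n} \right)^{-\alpha} dx.
\]

Since $m/n \to 0$ and $M/n \to 1$ as $n \to \infty$, it now follows from (62) that there is a sequence
$(b_n)_{n=1}^{\infty}$ tending to zero such that with probability at least $1 - \varepsilon$,

\[
\frac{L'_n}{n^{\alpha}} \ge \frac{2^{1-\alpha} (1-\alpha)^{\alpha} \pi}{\sin(\pi \alpha)} - b_n. \tag{64}
\]

The result now follows from (63) and (64). □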
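The constant in Lemma 18 comes from the Beta integral $\int_0^1 y^{\alpha-1}(1-y)^{-\alpha}\,dy = \Gamma(\alpha)\Gamma(1-\alpha)$ together with Euler's reflection formula. The following sketch evaluates both sides and the resulting limit constant; the function names are illustrative.

```python
import math

def beta_integral(alpha: float) -> float:
    """Gamma(alpha) * Gamma(1 - alpha), the value of the Beta integral B(alpha, 1 - alpha)."""
    return math.gamma(alpha) * math.gamma(1 - alpha)

def reflection_value(alpha: float) -> float:
    """pi / sin(pi * alpha), the right-hand side of Euler's reflection formula."""
    return math.pi / math.sin(math.pi * alpha)

def limit_constant(alpha: float) -> float:
    """2^(1 - alpha) * (1 - alpha)^alpha * pi / sin(pi * alpha), the limit of L'_n / n^alpha."""
    return 2 ** (1 - alpha) * (1 - alpha) ** alpha * reflection_value(alpha)
```

For $\alpha = 1/2$ the prefactor $2^{1/2} (1/2)^{1/2}$ equals one, so the limit constant reduces to $\pi$.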
Lemma 19. We have

\[
\lim_{n \to \infty} \frac{L_n}{n^{\alpha}} = \frac{2^{1-\alpha} (1-\alpha)^{\alpha} \pi}{\sin(\pi \alpha)} \quad \text{in probability.}
\]

Proof. By Lemma 18, it suffices to show that

\[
\lim_{n \to \infty} \frac{L_n - L'_n}{n^{\alpha}} = 0 \quad \text{in probability.} \tag{65}
\]

We have

\[
L_n - L'_n = \sum_{k=2}^{m-1} k \big( g(T_{k-1}) - g(T_k) \big) \le \sum_{k=2}^{m-1} k\, g'(T_{m-1}) (T_{k-1} - T_k).
\]

Let $A$ be the event that $T_{m-1} \ge 2/(m - 1) - 2/n - C/n^{9/8}$, which has probability at least $1 - \varepsilon$
by Lemma 17. There is a positive constant $C_2$ such that $g'(T_{m-1}) \le C_2 n^{3\alpha/4}$ on $A$ for all $n$.
Therefore,

\[
E[(L_n - L'_n) \mathbf{1}_A] \le C_2 n^{3\alpha/4} \sum_{k=2}^{m-1} k\, E[T_{k-1} - T_k] \le C_2 n^{3\alpha/4} \sum_{k=2}^{m-1} \frac{2}{k-1} \le 2 C_2 n^{3\alpha/4} (1 + \log n).
\]

Thus, by Markov's Inequality,

\[
P(L_n - L'_n > \varepsilon n^{\alpha}) \le P(A^c) + \frac{E[(L_n - L'_n) \mathbf{1}_A]}{\varepsilon n^{\alpha}} \le \varepsilon + \frac{2 C_2}{\varepsilon}\, n^{-\alpha/4} (1 + \log n),
\]

which is less than $2\varepsilon$ for sufficiently large $n$. The result follows. □

Recall that $L_n$ is the sum of the lengths of all branches in the coalescent tree. Also, recall that
mutations occur along each branch of the coalescent tree at times of a Poisson process of rate
$\theta$. Therefore, if we denote by $S_n$ the number of mutations in the tree, then conditional on $L_n$,
the distribution of $S_n$ is Poisson with mean $\theta L_n$. Thus, Lemma 19 and Chebyshev's Inequality
immediately yield the following result.

Corollary 20. We have

\[
\lim_{n \to \infty} \frac{S_n}{n^{\alpha}} = \frac{\theta\, 2^{1-\alpha} (1-\alpha)^{\alpha} \pi}{\sin(\pi \alpha)} \quad \text{in probability.}
\]

Theorem 6 now follows from Corollary 20 and the next lemma.



Lemma 21. We have

\[
\lim_{n \to \infty} \frac{K_n - S_n}{n^{\alpha}} = 0 \quad \text{in probability.}
\]

Proof. Note that if the most recent mutation inherited by two sampled individuals is the same, 
then all of the mutations inherited by these individuals must be the same. This is because 
when we follow the two lineages backwards in time, they must coalesce before any mutations are 
observed. Therefore, each block of the allelic partition $\Pi_n$ can be associated with a mutation that
is the most recent mutation inherited by the individuals in that block, with the possible exception
of one block corresponding to individuals with no mutations. It follows that $K_n \le S_n + 1$.

To get a bound in the other direction, note that the only mutations that are not associated 
with a block of the allelic partition as above are the mutations that are not the most recent 
mutation inherited by any individual. We denote the number of such mutations by $B_n$. Then
$K_n \ge S_n - B_n$, so it suffices to show that $B_n/n^{\alpha}$ converges in probability to zero as $n \to \infty$.

Let $R_n$ denote the number of mutations that occur when the number of lineages is $m - 1$ or
fewer. Enumerate the remaining mutations in decreasing order of time, so that the first mutation
is the most recent one, the second mutation is the second most recent, and so on. Let $R_{k,n}$ denote
the number of mutations along the branch of the coalescent tree that we get by starting at the
$k$th mutation and following this lineage back until time

\[
g\left( \frac{2}{m-1} - \frac{2}{n} + \frac{C}{n^{9/8}} \right),
\]

where $C$ is the constant from Lemma 17. Choose $C_3 > \theta\, 2^{1-\alpha} (1-\alpha)^{\alpha} \pi / \sin(\pi \alpha)$. On the event
that $T_{m-1} \le 2/(m - 1) - 2/n + C/n^{9/8}$, which has probability at least $1 - \varepsilon$ by Lemma 17, and
on the event that $S_n \le C_3 n^{\alpha}$, which has probability tending to one as $n \to \infty$ by Corollary 20,
we have

\[
B_n \le R_n + \sum_{k=1}^{\lfloor C_3 n^{\alpha} \rfloor} R_{k,n}. \tag{66}
\]

Conditional on $L_n$ and $L'_n$, the distribution of $R_n$ is Poisson with mean $\theta (L_n - L'_n)$. Therefore,
by (65), $R_n/n^{\alpha}$ converges to zero in probability as $n \to \infty$. Because mutations occur along each
lineage at rate $\theta$, we have for all $k \le \lfloor C_3 n^{\alpha} \rfloor$,

\[
E[R_{k,n}] \le \theta\, g\left( \frac{2}{m-1} - \frac{2}{n} + \frac{C}{n^{9/8}} \right) \le C_4 n^{-3(1-\alpha)/4}
\]

for some positive constant $C_4$. By summing over $k$ and then applying Markov's Inequality, we
get that

\[
\lim_{n \to \infty} \frac{1}{n^{\alpha}} \sum_{k=1}^{\lfloor C_3 n^{\alpha} \rfloor} R_{k,n} = 0 \quad \text{in probability.}
\]

Since (66) holds with probability at least $1 - 2\varepsilon$ for sufficiently large $n$, the result follows. □